I am doing some manipulation on a data frame:
df
Node Interface Speed carrier 1-May 9-May 2-Jun 21-Jun
Server1 internet1 10 ATT 20 30 50 90
Server1 wan3.0 20 Comcast NaN NaN NaN 100
Server1 wan3.0 50 Comcast 30 40 40 NaN
Server2 wan2 100 Sprint 90 70 NaN NaN
Server2 wan2 20 Sprint NaN NaN 88 70
Server2 Internet2 40 Verizon 10 60 90 70
I need to merge rows in data frame group by Node and Interface, replace nan values with the other row and pick the max value for the speed for the interface.
expected data frame should be like this:
df1
Node Interface Speed carrier 1-May 9-May 2-Jun 21-Jun
Server1 internet1 10 ATT 20 30 50 90
Server1 wan3.0 50 Comcast 30 40 40 100
Server2 wan2 100 Sprint 90 70 88 70
Server2 Internet2 40 Verizon 10 60 90 70
I tried this:
df2=df.groupby(['Node','Interface','carrier']),agg({'Speep': 'max'}).reset_index()
df3=df.drop('Speed', axis=1)
df4=df3.ffill().drop_duplicates()
Not quite working. Is there an easy way to merge rows, replace nan values with the other row values and pick the max Speed for the Speed cell value?