Hope some one can help me with this. I have several dataframes that contains a value a corresponding id and name. For each id I want the max value for all dataframes but I must also now for which name this max value occured.
The example only shows two dataframes but I really have about a 100 which also contains about a 100 id’s so can’t just take the max and then check which values are contained in one of the dataframes and take the name from there.
I have tried with groupby but can not get it to work.
import pandas as pd
d1 = pd.DataFrame()
d2 = pd.DataFrame()
d1['id'] = [1,2,3,4,5]
d2['id'] = [1,2,3,4,5]
d1['max'] = [10,20,30,40,50]
d2['max'] = 25
d1['name'] = 'name_1'
d2['name'] = 'name_2'
df_max_list=[d1, d2]
df_max_out=pd.concat(df_max_list).groupby(level=0).max()
Got the result:
id max name
0 1 25 name_2
1 2 25 name_2
2 3 30 name_2
3 4 40 name_2
4 5 50 name_2
Expected result:
id max name
0 1 25 name_2
1 2 25 name_2
2 3 30 name_1
3 4 40 name_1
4 5 50 name_1
Regina Phalange is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.