Suppose I have two DataFrames:
<code>import pandas as pd
df1 = pd.DataFrame({'Date': [0, 2, 3], 'Id':['A','B','C'], 'Country': ['US', 'CA', 'DE']})
df2 = pd.DataFrame({'Date': [0, 1, 2, 3], 'US': [10, 20, 30, 40], 'CA':[5, 10, 15, 20], 'DE':[100, 200, 300, 400]})
</code>
I would like to add a column to df1 that holds the value from df2 for the corresponding Country and the corresponding Date.
The desired result should look like this:
<code>df1 = pd.DataFrame({'Date': [0, 2, 3], 'Id':['A','B','C'], 'Country': ['US', 'CA', 'DE'], 'New':[10, 15, 400]})
</code>
You can use melt and merge:
<code>out = df1.merge(df2.melt('Date', var_name='Country', value_name='New'),
                on=['Date', 'Country'], how='left')
</code>
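To make the two steps easier to follow, here is a minimal sketch that splits them apart, using the sample data from the question. The intermediate name long_df2 is only illustrative and is not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'Date': [0, 2, 3], 'Id': ['A', 'B', 'C'],
                    'Country': ['US', 'CA', 'DE']})
df2 = pd.DataFrame({'Date': [0, 1, 2, 3], 'US': [10, 20, 30, 40],
                    'CA': [5, 10, 15, 20], 'DE': [100, 200, 300, 400]})

# Step 1: reshape df2 from wide (one column per country)
# to long (one row per Date/Country pair).
long_df2 = df2.melt('Date', var_name='Country', value_name='New')
print(long_df2.head())

# Step 2: a left merge keeps every row of df1 and attaches the
# matching value for its (Date, Country) pair.
out = df1.merge(long_df2, on=['Date', 'Country'], how='left')
print(out)
```

Because the merge is a left join, a row of df1 whose Date/Country pair does not exist in df2 would be kept with NaN in the New column.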
Or use an indexing lookup across the two DataFrames:
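<code>import numpy as np
# Encode each country as an integer position (idx) into the unique labels (cols)
idx, cols = pd.factorize(df1['Country'])
# Align df2 to df1's dates and countries, then pick one value per row
df1['New'] = (df2.set_index('Date')
                 .reindex(index=df1['Date'], columns=cols)
                 .to_numpy()[np.arange(len(df1)), idx]
             )
</code>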
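A step-by-step sketch of the same lookup may help show what each piece does. The fourth row of df1 (Id 'D') is a hypothetical addition, included only to show that a repeated country is handled. The names grid and picked are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question, plus one hypothetical row (Id 'D')
df1 = pd.DataFrame({'Date': [0, 2, 3, 1], 'Id': ['A', 'B', 'C', 'D'],
                    'Country': ['US', 'CA', 'DE', 'US']})
df2 = pd.DataFrame({'Date': [0, 1, 2, 3], 'US': [10, 20, 30, 40],
                    'CA': [5, 10, 15, 20], 'DE': [100, 200, 300, 400]})

# idx holds one integer code per row of df1; cols holds the unique countries.
idx, cols = pd.factorize(df1['Country'])

# One row per row of df1 (in df1's date order), one column per unique country.
grid = df2.set_index('Date').reindex(index=df1['Date'], columns=cols)

# Pair row i with column idx[i] to pick exactly one value per row of df1.
picked = grid.to_numpy()[np.arange(len(df1)), idx]
df1['New'] = picked
print(df1)
```

Since reindex fills missing labels with NaN, a date or country absent from df2 should yield NaN here rather than raise an error.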
Output:
<code> Date Id Country New
0 0 A US 10
1 2 B CA 15
2 3 C DE 400
</code>
<code>import pandas as pd
# Sample data (columns renamed to 'day' and 'country')
df1 = pd.DataFrame({'day': [0, 2, 3], 'Id':['A','B','C'], 'country': ['US', 'CA', 'DE']})
df2 = pd.DataFrame({'day': [0, 1, 2, 3], 'US': [10, 20, 30, 40], 'CA':[5, 10, 15, 20], 'DE':[100, 200, 300, 400]})
# Use the day column as the index of df2
df2.set_index('day', inplace=True)
# For each row of df1, look up the df2 value at (day, country)
df1['Value'] = df1.apply(lambda row: df2.at[row['day'], row['country']], axis=1)
print(df1)
</code>
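Note that .at raises a KeyError when a label is missing, so the row-wise lookup above fails if df1 contains a day or country that df2 lacks. One possible guard is sketched below. The helper name lookup and the day value 5 are hypothetical, chosen only to illustrate a missing pair.

```python
import numpy as np
import pandas as pd

# Day 5 does not exist in df2 (hypothetical value for illustration)
df1 = pd.DataFrame({'day': [0, 2, 5], 'Id': ['A', 'B', 'C'],
                    'country': ['US', 'CA', 'DE']})
df2 = pd.DataFrame({'day': [0, 1, 2, 3], 'US': [10, 20, 30, 40],
                    'CA': [5, 10, 15, 20],
                    'DE': [100, 200, 300, 400]}).set_index('day')

def lookup(row):
    """Return df2's value for the row's day/country, or NaN if the pair is absent."""
    if row['day'] in df2.index and row['country'] in df2.columns:
        return df2.at[row['day'], row['country']]
    return np.nan

df1['Value'] = df1.apply(lookup, axis=1)
print(df1)
```

Row-wise apply is simple to read but calls a Python function once per row, so for large frames the merge or indexing approaches are likely to be faster.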