I have a pandas DataFrame that I iterate over to get values from two columns (col_1 and col_2). The second column contains lists. For every value in col_1, I need to get the corresponding list from col_2 and insert the col_1 value into that list.
This is how I did it:
import pandas as pd

dict_for_df = {
    "col_1": ["A", "B", "C"],
    "col_2": [["D", "E"], ["F", "H"], ["I", "J"]],
}
df = pd.DataFrame(dict_for_df)

for i in range(df.shape[0]):
    col_1_value = df["col_1"][i]
    col_2_list = df["col_2"][i]
    col_2_list.insert(0, col_1_value)
However, the insert operation on col_2_list also modifies the original df. How can I avoid this?
col_1 col_2
0 A [A, D, E]
1 B [B, F, H]
2 C [C, I, J]
A possible solution, which uses:
- assign to create a new column col_1 in the dataframe df, where each element is converted to a list using the map function.
- sum to concatenate the lists in the columns along the rows (axis=1), resulting in a new column col_2.
df['col_2'] = df.assign(col_1 = df['col_1'].map(list)).sum(axis=1)
Output:
col_1 col_2
0 A [A, D, E]
1 B [B, F, H]
2 C [C, I, J]
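If the goal is to leave df untouched, one alternative is to build a brand-new list for every row instead of calling insert on the list stored in the DataFrame. The sketch below is not part of the solution above; it reuses the sample data from the question, and the name combined is introduced here only for illustration. Unlike map(list), it does not split a multi-character string such as "AB" into separate characters.

```python
import pandas as pd

df = pd.DataFrame({
    "col_1": ["A", "B", "C"],
    "col_2": [["D", "E"], ["F", "H"], ["I", "J"]],
})

# [first, *rest] creates a new list on every iteration, so the lists
# stored in df["col_2"] are never mutated.
# `combined` is a hypothetical name used only for this example.
combined = [[first, *rest] for first, rest in zip(df["col_1"], df["col_2"])]

print(combined)
print(df["col_2"].tolist())  # the original lists are unchanged
```

The same idea works inside the original loop: copying the list first (for example with list(df["col_2"][i])) and inserting into the copy keeps the DataFrame as it was.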