The function I am using to convert my DataFrame into a nested dictionary strips the column names from the hierarchy, making the dictionary difficult to navigate.
I have a large DataFrame that looks similar to this:
import pandas as pd

exploded_df = pd.DataFrame({
    'school_code': [1, 1, 1, 1, 2, 2, 2, 2],
    'school_name': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'district_code': [10, 10, 10, 10, 20, 20, 20, 20],
    'year': [2022, 2022, 2023, 2023, 2022, 2022, 2023, 2023],
    'source': ['S1', 'S2', 'S1', 'S2', 'S1', 'S2', 'S1', 'S2'],
    'enrollment_measure_name': ['M1', 'M2', 'M1', 'M2', 'M1', 'M2', 'M1', 'M2'],
    'value': [100, 150, 120, 170, 100, 150, 90, 100]
})
I have been trying to use the following function and several variations.
`def frame_to_nested_dict(exploded_df, levels):
    if not levels:
        # Base case: map each measure name to the list of its values
        return exploded_df.groupby('enrollment_measure_name')['value'].apply(list).to_dict()
    # Group by the highest level of keys and recursively build the nested dictionary
    level = levels[0]
    exploded_df = exploded_df.dropna(subset=[level])
    grouped = exploded_df.groupby(level)
    return {
        key: frame_to_nested_dict(group, levels[1:])
        for key, group in grouped
    }`
with
levels = ['school_code', 'school_name', 'district_code', 'year', 'source']
frame_to_nested_dict(exploded_df, levels)
output:
{1: {'A': {10: {2022: {'S1': {'M1': [100]}, 'S2': {'M2': [150]}}, 2023: {'S1': {'M1': [120]}, 'S2': {'M2': [170]}}}}}, 2: {'B': {20: {2022: {'S1': {'M1': [100]}, 'S2': {'M2': [150]}}, 2023: {'S1': {'M1': [90]}, 'S2': {'M2': [100]}}}}}}
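To illustrate the navigation problem, here is a small sketch that looks up a single value in the output above (the dictionary is copied verbatim from that output; the name `result` is just for this example). Nothing in the keys says which column each level comes from, so the lookup only works if you remember the order of `levels`:

```python
# The current output, copied from above
result = {
    1: {'A': {10: {2022: {'S1': {'M1': [100]}, 'S2': {'M2': [150]}},
                   2023: {'S1': {'M1': [120]}, 'S2': {'M2': [170]}}}}},
    2: {'B': {20: {2022: {'S1': {'M1': [100]}, 'S2': {'M2': [150]}},
                   2023: {'S1': {'M1': [90]}, 'S2': {'M2': [100]}}}}},
}

# school_code -> school_name -> district_code -> year -> source -> measure
# The reader has to know this order; the dictionary itself does not record it.
lookup = result[1]['A'][10][2022]['S1']['M1']
print(lookup)  # [100]
```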
The desired output would be:
{
  'school_code': {
    1: {
      'school_name': {
        'A': {
          'district_code': {
            10: {
              'year': {
                2022: {
                  'source': {
                    'S1': {
                      'enrollment_measure_name': {
                        'M1': {'value': [100]}
                      }
                    },
                    'S2': {
                      'enrollment_measure_name': {
                        'M2': {'value': [150]}
                      }
                    }
                  }
                },
                2023: {
                  'source': {
                    'S1': {
                      'enrollment_measure_name': {
                        'M1': {'value': [120]}
                      }
                    },
                    'S2': {
                      'enrollment_measure_name': {
                        'M2': {'value': [170]}
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    },
    2: {
      'school_name': {
        'B': {
          'district_code': {
            20: {
              'year': {
                2022: {
                  'source': {
                    'S1': {
                      'enrollment_measure_name': {
                        'M1': {'value': [100]}
                      }
                    },
                    'S2': {
                      'enrollment_measure_name': {
                        'M2': {'value': [150]}
                      }
                    }
                  }
                },
                2023: {
                  'source': {
                    'S1': {
                      'enrollment_measure_name': {
                        'M1': {'value': [90]}
                      }
                    },
                    'S2': {
                      'enrollment_measure_name': {
                        'M2': {'value': [100]}
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
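One direction that might produce this shape is to wrap every level's group keys under the column name before recursing. The following is only a minimal sketch under that assumption: `frame_to_labeled_dict` and its `leaf_key` / `value_col` parameters are hypothetical names introduced here, and I have only checked it against the small sample frame, not the full data.

```python
import pandas as pd

exploded_df = pd.DataFrame({
    'school_code': [1, 1, 1, 1, 2, 2, 2, 2],
    'school_name': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'district_code': [10, 10, 10, 10, 20, 20, 20, 20],
    'year': [2022, 2022, 2023, 2023, 2022, 2022, 2023, 2023],
    'source': ['S1', 'S2', 'S1', 'S2', 'S1', 'S2', 'S1', 'S2'],
    'enrollment_measure_name': ['M1', 'M2', 'M1', 'M2', 'M1', 'M2', 'M1', 'M2'],
    'value': [100, 150, 120, 170, 100, 150, 90, 100],
})


def frame_to_labeled_dict(df, levels,
                          leaf_key='enrollment_measure_name',
                          value_col='value'):
    """Hypothetical variant: like frame_to_nested_dict, but each level's
    keys are nested under the name of the column they come from."""
    if not levels:
        # Base case: {leaf column name: {measure: {value column name: [values]}}}
        leaves = df.groupby(leaf_key)[value_col].apply(list).to_dict()
        return {leaf_key: {name: {value_col: values}
                           for name, values in leaves.items()}}
    level = levels[0]
    df = df.dropna(subset=[level])
    # Wrap this level's group keys under the column name, then recurse
    return {
        level: {
            key: frame_to_labeled_dict(group, levels[1:], leaf_key, value_col)
            for key, group in df.groupby(level)
        }
    }


levels = ['school_code', 'school_name', 'district_code', 'year', 'source']
nested = frame_to_labeled_dict(exploded_df, levels)

# Every step of a lookup now names the column it belongs to
sample = (nested['school_code'][1]['school_name']['A']['district_code'][10]
          ['year'][2022]['source']['S1']['enrollment_measure_name']['M1']['value'])
```

The only change from the original function is that both the recursive step and the base case return a one-key dictionary whose key is the column name, so the nesting depth doubles compared with the current output.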