I have the following data-frame (read from a csv file):
my_df:
my_date my_id values key factor
1/1/2024 _One 123 key1 .56
1/7/2024 _One 567 key1 .75
1/14/2024 _One 100 key1 .81
1/14/2024 _One 100 key2 .44
1/1/2024 _Two 150 key3 .91
1/7/2024 _Two 130 key3 .88
So, there is an overlap of certain dates for two or more keys belonging to the same ‘id’. What i want my data-frame to look like is as follows, that is, i need to calculate the allocated values based on the factor weights. Note: the calculated weight is obtained by dividing the factor by the sum of the factors in the overlapping periods. Say,
my_df:
my_date my_id values key factor weights allocated_values
1/1/2024 _One 123 key1 0.56 1 123
1/7/2024 _One 500 key1 0.75 1 500
1/14/2024 _One 100 key1 0.81 0.648 64.8
1/14/2024 _One 100 key2 0.44 0.352 35.2
1/1/2024 _Two 160 key3 0.91 1 160
1/7/2024 _Two 130 key3 0.88 1 130
To achieve the above result, i am doing the following group by:
for name, group in my_df.groupby('my_id'):
for name1, group1 in group.groupby('key'):
factors = group1['factor']
weight = factors['factor']/factors.sum()
#what i tried- approach1
group['weights'] = weight #doesn't work
#what i tried next
my_df['weights'] = my_df.update(group) #doesn't work
I am so tired now, unable to think any further. So posting it here for any help/guidance.
Would much appreciate any hints.
Thanks!