in sql request, i have a chance to sum with lagged param on the fly, like:
CASE
WHEN dt_rep.fcr = 0::numeric THEN 0::numeric
ELSE (dt_rep.fcr - lag(dt_rep.fcr) OVER (PARTITION BY dt_rep.place_id ORDER BY dt_rep.bucket)) / dt_rep.fcr
END AS cr
here, I order the table by bucket (it is timeframe) and sorted them by place_id (to let group by place_id for lagging).
Then, i can simply filtered the table with values of cr is null what gives me required structure.
in pandas, I can shift and assign new column like:
data_fin['fcr_b'] = data_fin.groupby(['place_id'])['fcr'].shift(1)
then I need to combine columns ‘fcr_b’ and ‘fcr’ to new one ‘cr’ and filter dataframe where fcr_b is NaN.
Is there more accurate approache to get new column ‘cr’ something like:
data_fin[cr] = data_fin['fcr'].apply(lambda x: x + lag(x).groupby('place_id'))
<= of cause, it is not working example!