I have a pandas dataframe like this:
1960-09-01 24027064 4503904.333
1960-10-01 18020298 3377928.25
1960-11-01 12013532 2251952.167
1960-12-01 6006766 1125976.083
1961-01-01 0 0
1961-02-01 0 0
1961-03-01 0 0
1961-04-01 0 0
1961-05-01 0 0
1961-06-01 0 0
1961-07-01 0 0
1961-08-01 0 0
1961-09-01 0 0
1961-10-01 0 0
1961-11-01 0 0
1961-12-01 0 0
1969-01-01 0 0
1969-02-01 6173432.667 1150976.083
1969-03-01 12346865.33 2301952.167
1969-04-01 18520298 3452928.25
I am trying to smooth the data using something like this:
from scipy.signal import savgol_filter
df = df.apply(savgol_filter, window_length=df.shape[0] // 5, polyorder=2)
I am getting something like this:
1960-09-01 25874679.88 4850242.328
1960-10-01 24574614.17 4606543.324
1960-11-01 23301520.97 4367900.35
1960-12-01 22055400.26 4134313.405
1961-01-01 20836252.04 3905782.489
1961-02-01 19644076.31 3682307.603
1961-03-01 18478873.09 3463888.745
1961-04-01 17340642.35 3250525.916
1961-05-01 16229384.11 3042219.117
1961-06-01 15145098.37 2838968.347
1961-07-01 14087785.12 2640773.605
1961-08-01 13057444.36 2447634.893
1961-09-01 12054076.1 2259552.21
1961-10-01 11077680.33 2076525.557
1961-11-01 10128257.06 1898554.932
1961-12-01 9205806.28 1725640.336
1969-01-01 19071395.37 5088684.927
1969-02-01 19448311.7 5412158.942
1969-03-01 19790962.91 5737508.936
1969-04-01 20099349 6080752.937
But I don't want it that way. I want to keep the zeros as zeros, and when computing the smoothing function I want to ignore the zeros and apply it only to the remaining elements in the sliding window. For example, if my window_length is 5 and the current window contains [0, 0, 6173432.667, 12346865.33, 18520298], I want the smoothing function to drop the zeros, shrink the window to size 3, i.e. [6173432.667, 12346865.33, 18520298], and calculate accordingly. In other words, I essentially want piecewise curve fitting over each non-zero stretch as the smoothing function.
savgol_filter applies convolve1d internally (https://github.com/scipy/scipy/blob/v1.14.1/scipy/signal/_savitzky_golay.py#L351), so it already behaves partly as you describe: zero values add nothing to the convolution sum. Note, however, that the filter coefficients still correspond to the full window_length, so this is not identical to refitting on a shorter window.
You can simply save a mask of the current zero positions and set those positions back to zero in the final result.
# Remember where the zeros are before smoothing
zero_mask = values == 0
values = values.apply(
    savgol_filter,
    window_length=values.shape[0] // 5,
    polyorder=2
)
# Restore the zeros in the smoothed result
values[zero_mask] = 0
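To make the point about the convolution concrete, here is a small sketch using the example window from the question. It assumes window_length=5 and polyorder=2, and the variable names are illustrative only. It shows that the zero samples contribute nothing to the weighted sum, but that the result still differs from fitting only the three non-zero samples.

```python
import numpy as np
from scipy.signal import savgol_coeffs, savgol_filter

# Example window from the question (assumed window_length=5, polyorder=2).
window = np.array([0.0, 0.0, 6173432.667, 12346865.33, 18520298.0])

# Savitzky-Golay weights for the centre point of a 5-sample quadratic fit.
coeffs = savgol_coeffs(5, 2)

# The smoothed centre value is a weighted sum, so zero samples add nothing to it.
full_sum = float(np.dot(coeffs, window))
nonzero_sum = float(np.dot(coeffs[2:], window[2:]))

# savgol_filter produces the same value at the centre of this window.
filtered_centre = float(savgol_filter(window, window_length=5, polyorder=2)[2])

# Fitting only the three non-zero samples gives a different value,
# because the weights above were derived for a 5-sample window.
short_fit = float(savgol_filter(window[2:], window_length=3, polyorder=1)[0])

print(full_sum, nonzero_sum, filtered_centre, short_fit)
```

So masking the zeros afterwards keeps them at zero, but the non-zero values next to a zero stretch are still pulled towards zero by the full-width weights.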
Alternatively, you can apply savgol_filter only to the non-zero values, if you want the result to look as if there were no zeros at all:
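# .loc needs a row mask, so select the rows that are not zero in every column
nonzero_rows = ~(values == 0).all(axis=1)
values.loc[nonzero_rows] = values.loc[nonzero_rows].apply(
    savgol_filter,
    window_length=values.shape[0] // 5,
    polyorder=2
)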
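If you want true piecewise fitting, where each non-zero stretch is smoothed on its own and never sees its neighbours, one possible sketch is below. The helper name smooth_nonzero_runs and its defaults are hypothetical and not part of SciPy or pandas. It assumes that shrinking the window to the largest odd size that fits the stretch is acceptable, and that stretches shorter than 3 samples are left unchanged.

```python
import numpy as np
import pandas as pd
from scipy.signal import savgol_filter


def smooth_nonzero_runs(column, window_length=5, polyorder=2):
    """Smooth each contiguous non-zero run of a Series separately; zeros stay zero."""
    values = column.to_numpy(dtype=float)
    result = values.copy()
    nonzero = values != 0
    # Run boundaries: positions where the mask flips between zero and non-zero.
    padded = np.concatenate(([False], nonzero, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    for start, stop in zip(edges[::2], edges[1::2]):
        length = stop - start
        # Shrink the window to the largest odd size that fits in this run.
        window = min(window_length, length)
        if window % 2 == 0:
            window -= 1
        order = min(polyorder, window - 1)
        if window < 3 or order < 1:
            continue  # run too short to fit a polynomial; leave it unchanged
        result[start:stop] = savgol_filter(values[start:stop], window, order)
    return pd.Series(result, index=column.index, name=column.name)


# Usage on a small frame shaped like the one in the question (column names are made up).
df = pd.DataFrame({
    "a": [24027064, 18020298, 12013532, 6006766, 0, 0, 0, 6173432.667, 12346865.33, 18520298],
    "b": [4503904.333, 3377928.25, 2251952.167, 1125976.083, 0, 0, 0, 1150976.083, 2301952.167, 3452928.25],
})
smoothed = df.apply(smooth_nonzero_runs)
```

Note that when the window shrinks to 3 with polyorder 2, the fit passes through every point, so very short stretches come back essentially unsmoothed; lowering polyorder for short runs is one way to change that.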