I have several measured values, each in a time series.
The time steps are unevenly spaced and also differ between the data series.
I’m trying to interpolate each series onto a fixed 10-second time step using Python and pandas, but the interpolated values are all NaN.
This is a data snippet:
Timestamp,Value
2023-05-20T22:00:04.023Z,102
2023-05-20T22:00:14.033Z,100
2023-05-20T22:00:24.074Z,99
2023-05-20T22:00:35.484Z,99
2023-05-20T22:00:44.029Z,102
2023-05-20T22:00:54.054Z,100
2023-05-20T22:01:04.026Z,99
2023-05-20T22:01:14.029Z,103
2023-05-20T22:01:24.054Z,99
2023-05-20T22:01:34.022Z,98
2023-05-20T22:01:44.026Z,99
2023-05-20T22:01:54.062Z,100
2023-05-20T22:02:04.025Z,125
And this is the Python script.
I’d say everything works as required until the interpolate method.
import pandas as pd
Power_curr = pd.read_csv("pathtodata.csv",parse_dates=['Timestamp'])
# Convert the 'Timestamp' column to datetime format
Power_curr['Timestamp'] = pd.to_datetime(Power_curr['Timestamp'])
# timestamp is index
Power_curr.set_index('Timestamp', inplace=True)
# Find start and end, rounded to 10 s
start_time = Power_curr.index.min().ceil('10s')
end_time = Power_curr.index.max().floor('10s')
# Create new timestamp series
new_time_index = pd.date_range(start=start_time, end=end_time, freq='10s')
# Create new data frame and interpolate into new timestamps
Power_curr_interpolated = Power_curr.reindex(new_time_index).interpolate(method='time')
I believe that the result of Power_curr.reindex(new_time_index) is the problem.
It returns only NaN values, so there is no chance to interpolate these values. But why NaN?
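To check the suspicion about reindex, a minimal sketch on the first four rows of the snippet may help. reindex only keeps values whose labels match the new index exactly, and a timestamp such as 22:00:04.023 never coincides with 22:00:10. One possible workaround, offered here as an assumption rather than a confirmed fix, is to reindex onto the union of the old and new timestamps, interpolate there, and then keep only the 10-second grid. The names csv_text, df, new_index, combined and interpolated are made up for this illustration.

```python
import io

import pandas as pd

# First rows of the data snippet, embedded so the sketch is self-contained.
csv_text = """Timestamp,Value
2023-05-20T22:00:04.023Z,102
2023-05-20T22:00:14.033Z,100
2023-05-20T22:00:24.074Z,99
2023-05-20T22:00:35.484Z,99
"""

df = pd.read_csv(io.StringIO(csv_text))
df["Timestamp"] = pd.to_datetime(df["Timestamp"], utc=True)
df = df.set_index("Timestamp")

# Regular 10-second grid between the rounded start and end.
start = df.index.min().ceil("10s")
end = df.index.max().floor("10s")
new_index = pd.date_range(start=start, end=end, freq="10s")

# reindex aligns on exact labels; no original timestamp sits on the grid,
# so every value of the reindexed frame is NaN.
overlap = df.index.intersection(new_index)
reindexed = df.reindex(new_index)
print(len(overlap), reindexed["Value"].isna().all())

# Assumed workaround: keep the original points while interpolating,
# then select only the grid timestamps.
combined = df.index.union(new_index)
interpolated = (
    df.reindex(combined)
    .interpolate(method="time")
    .reindex(new_index)
)
print(interpolated)
```

The design choice in this sketch is to interpolate while the measured points are still present in the index, because interpolate(method='time') needs known values on both sides of each grid timestamp.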
Background information: Why this question?
One time series is the electricity consumption of a house, which is recorded on the electricity meter.
Another time series is the electricity generation of a PV system.
Both were recorded for over a year.
With this information, I would now like to calculate the degree of self-sufficiency that could have been achieved with a battery storage system.
What has been done so far?
-
I’ve tried different interpolation settings like ‘linear’ and ‘nearest’
-
I checked that new_time_index has the right type with
Power_curr['Timestamp'] = pd.to_datetime(Power_curr['Timestamp'])
-
I’ve checked the installed pandas version: v2.2.2
-
I tried
Power_curr_interpolated = Power_curr['Timestep'].resample('10s').mean()
instead of Power_curr.reindex(new_time_index).interpolate(method='time').
At least there are no NaN values, but the result looks wrong, as the values do not change when the timestamp changes:
[Screenshot: PyCharm watch]
-
I’ve checked other posts related to this issue, like this one, but found no solution
-
I also checked this one, which looked promising,
and came to believe that the result of Power_curr.reindex(new_time_index)
is the problem. It returns only NaN values, so there is nothing to interpolate. But why?
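Regarding the resample('10s').mean() attempt in the list above, a small sketch on the first four rows may explain why the values appear unchanged. resample groups the samples into 10-second bins and averages each bin; it does not interpolate. With roughly one sample per bin, as assumed here, each mean is simply the original value with its timestamp moved to the start of the bin. The names series and binned are illustrative only.

```python
import pandas as pd

# First four samples of the data snippet.
index = pd.to_datetime(
    [
        "2023-05-20T22:00:04.023Z",
        "2023-05-20T22:00:14.033Z",
        "2023-05-20T22:00:24.074Z",
        "2023-05-20T22:00:35.484Z",
    ],
    utc=True,
)
series = pd.Series([102, 100, 99, 99], index=index, name="Value")

# Each 10-second bin contains exactly one sample, so the bin mean
# equals that sample; only the timestamp label changes.
binned = series.resample("10s").mean()
print(binned)
```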