I thought this would be easy, but I am having problems. I have data with a dates column: Start_Date.
I want to copy the data from the smaller pivot table to the larger table df_times when the dates are the same.
df = pd.read_csv(savepath + '01B00015.csv')
df.shape # (416, 10)
df.Start_Date.dtype # returns dtype('O')
df['Start_Date'] = pd.to_datetime(df['Start_Date'])
# sort by date to get the lowest date at the top, newest date at the bottom
# in case it is not in chronological order
df = df.sort_values(by='Start_Date', ascending=True)
# get the start date and the end date
df2 = df[0::len(df)-1 if len(df) > 1 else 1]
start = df2.iloc[0]['Start_Date'] # like Timestamp('2018-04-04 12:00:00')
end = df2.iloc[1]['Start_Date'] # like Timestamp('2024-03-06 00:00:00')
pivot_df = df.pivot_table(index=['Start_Date', 'Site_Description'], columns='Class_Weight', values='Traffic_Count', aggfunc='sum', fill_value=0)
pivot_df = pivot_df.reset_index()
pivotcols = pivot_df.columns
df_times = pd.DataFrame(columns=pivotcols)
# df_times should contain all the days from start to end
df_times['Start_Date'] = pd.date_range(start=start, end=end)
df_times.shape # shows (2163, 4) rows
# 5 * 365 = 1825
# now I want to copy from pivot_df to df_times when Start_Date is the same in both dfs
df_times.loc = pivot_df.loc[pivot_df['Start_Date'] == df_times['Start_Date']]
But I get this error, which I don’t understand as df_times is made up the same as pivot_df, at least, I thought so!
Traceback (most recent call last): File
“/usr/lib/python3.10/idlelib/run.py”, line 578, in runcode
exec(code, self.locals) File “<pyshell#136>”, line 1, in File
“/home/pedro/.local/lib/python3.10/site-packages/pandas/core/ops/common.py”,
line 72, in new_method
return method(self, other) File “/home/pedro/.local/lib/python3.10/site-packages/pandas/core/arraylike.py”,
line 42, in eq
return self._cmp_method(other, operator.eq) File “/home/pedro/.local/lib/python3.10/site-packages/pandas/core/series.py”,
line 6237, in _cmp_method
raise ValueError(“Can only compare identically-labeled Series objects”) ValueError: Can only compare identically-labeled Series
objects
How can I best achieve my goal of inserting the values of pivot_df where the date is the same as in df_times?