I’m trying to recreate in Python, using the pandas library, a pivot table I made in Excel. I have over 500k rows of OD total trips data that I want to summarize, with time period as the filter. In Excel I would just set rows (O), columns (D), values (trips), and filter (time). So far in Python I have index, columns, values, and aggfunc, but I don’t know how to filter. There are 12 time periods and I only want to include 3.
I tried using both O and D as the index, with Time as the columns. Is there a way to remove columns from the pivot table?
Here is what I have that makes a full pivot table including all times:
import pandas as pd
df = pd.read_excel('…xlsx')
print(df.pivot_table(index=['O', 'D'], columns=['Time'], values=['Trips'], aggfunc='sum'))
I have also tried:
print(df[(df.Time == '7am')].pivot_table(index=['O', 'D'], columns=['Time'], values='Trips', aggfunc='sum'))
This works for me, but I’m trying to include 3 different time periods, so I tried:
print(df[(df['Time'] == '7am') & (df['Time'] == '4pm') & (df['Time'] == 'All Day')].pivot_table(index=['O', 'D'], columns=['Time'], values='Trips', aggfunc='sum'))
But this didn’t work.
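A likely reason the last attempt selects nothing: `&` requires every condition to hold for the same row, and a single Time value cannot equal '7am', '4pm', and 'All Day' at once. Below is a minimal sketch of an OR-style filter using `Series.isin`. The small DataFrame and the names `wanted`, `filtered`, and `excel_like` are made up for illustration, and the column names O, D, Time, and Trips are assumed to match the spreadsheet.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet; the real data would come from pd.read_excel.
df = pd.DataFrame({
    "O": [1, 1, 1, 2, 2, 2],
    "D": [2, 2, 2, 1, 1, 1],
    "Time": ["7am", "4pm", "9am", "7am", "All Day", "9am"],
    "Trips": [10, 20, 30, 5, 50, 7],
})

# The three time periods to keep (assumed labels, taken from the attempts above).
wanted = ["7am", "4pm", "All Day"]

# Chaining == tests with & asks for rows matching all values at once, so nothing survives.
and_mask = (df["Time"] == "7am") & (df["Time"] == "4pm") & (df["Time"] == "All Day")

# isin keeps a row when its Time matches ANY of the wanted values (an OR test).
filtered = df[df["Time"].isin(wanted)]

# Same layout as the original attempts: O/D pairs as rows, one column per kept period.
pivot = filtered.pivot_table(index=["O", "D"], columns="Time", values="Trips", aggfunc="sum")
print(pivot)

# Closer to the Excel layout: O as rows, D as columns, trips summed over the kept periods.
excel_like = filtered.pivot_table(index="O", columns="D", values="Trips", aggfunc="sum")
print(excel_like)
```

The same filter could also be written with `|` between the three comparisons; `isin` is just shorter when the list of periods grows. Note that summing over a set of periods that includes 'All Day' may double count trips if 'All Day' already contains the hourly totals.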
I’m very new to Python and, admittedly, don’t understand much, so please let me know if I need to learn more about some aspect of Python to figure this out (or if it’s incredibly basic). Any resources that could point me in the right direction would also help. Thanks!