I have a long script that at some point produces a very small dataframe with 813 rows and 16 columns.
To this dataframe I apply groupby followed by agg:
fm = fm.groupby(
    ['Tower_ID', 'Cell_ID', 'Alarm ID', 'Severity', 'Alarm Type',
     'Alarm Text', 'Supplementary Info', 'PERIOD_START_TIME'
     # ,'File_Name'
     ]
).agg({
    'Full_Start_Date': 'min',
    'Full_End_Date': 'max',
    'Alarm hold time (sec)': 'sum',
    'End_Date': 'max',
    'Cross_Over': 'max',
    'Cross_Over_diff': 'max',
})
This fails with an out-of-memory error:
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 11.1 GiB for an array with shape (1485993600,) and data type int64
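To put the numbers from that message in perspective, here is a quick sanity check of the arithmetic. It only uses the figures from the error and the row count mentioned above; the variable names are my own.

```python
# Figures copied from the error message and from the dataframe shape.
n_elements = 1_485_993_600   # length of the int64 array numpy tried to allocate
bytes_per_int64 = 8
n_rows = 813                 # rows in fm before the groupby

size_gib = n_elements * bytes_per_int64 / 2**30
print(f"requested: {size_gib:.1f} GiB")  # matches the 11.1 GiB in the error

# How many array elements per input row the groupby asked for.
print(f"elements per row: {n_elements / n_rows:,.0f}")
```

So the requested array is several orders of magnitude larger than the 813 rows going in, which is what makes the error so surprising.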
Things that I have tried:
1 – Instead of fm = fm.groupby(...), I assigned the result to a different variable, e.g. bananas = fm.groupby(...). Same result.
2 – I tried converting the columns to the exact types I need (category, int, etc.). Same result.
What worked
Before the groupby, I save the fm dataframe to a file and then read the file back into fm:
fm.to_excel(r'C:\home\fm_data.xlsx')
fm = pd.read_excel(r'C:\home\fm_data.xlsx')
And this works!
Does anyone have an idea why?
This is a very poor workaround, and I would like to understand what the actual problem is.
I appreciate the help.
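Since the round trip through Excel rebuilds every column from the file contents, one thing that could be compared is the dtype of each grouping column before and after it. Below is a minimal sketch of such a check. The helper name dtype_report and the toy dataframe are made up for illustration, and I have not confirmed that a dtype difference is what causes the error.

```python
import pandas as pd

def dtype_report(df, cols):
    """One row per column: dtype, distinct values present, declared categories."""
    rows = []
    for col in cols:
        s = df[col]
        is_cat = isinstance(s.dtype, pd.CategoricalDtype)
        rows.append({
            "column": col,
            "dtype": str(s.dtype),
            "n_unique": s.nunique(dropna=False),
            # Only categorical columns carry a declared category list.
            "n_categories": len(s.cat.categories) if is_cat else None,
        })
    return pd.DataFrame(rows).set_index("column")

# Toy data standing in for fm; the real column contents are different.
toy = pd.DataFrame({
    "Tower_ID": pd.Categorical(["A", "B", "A"], categories=["A", "B", "C", "D"]),
    "Severity": [1, 2, 1],
})
report = dtype_report(toy, ["Tower_ID", "Severity"])
print(report)
```

On the real data this would be called once on fm right before to_excel and once right after read_excel, with the list of grouping columns, to see whether anything about the key columns changes.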