I am working on a small program and need some guidance.
Basically I am trying to read a CSV, put the attributes into a data frame and filter where “video = 1”. This has been done.
What I cant figure out to do is splitting this data frame into multiple data frames. In the image below, I am looking to split it into two data frames when either the “count” is greater than 1, or the “Time” is greater than 100. Basically these are different recordings in the same log I need to split apart for analysis.
It might not happen in each log, it could happen 2+ times.
I am thinking of a For Loop to determine if the previous row is greater than the values I mentioned above, but I cant figure out making it work.
Any help is appreciated.
2
With your clarification that is the CHANGE in count
or Time
that you want to mark the separation then there are two issues: first to mark the separate parts of the DF and the split the parts using groupby
; secondly to store the separated DFs. The first can be done using a condition and cumsum
and the second by storing the separate DF’s in a List.
import pandas as pd
df = pd.DataFrame ({ 'count': [3, 4, 5, 6, 7, 15, 16, 17, 18, 19, 20],
'Time': [400, 500, 600, 700, 800, 1600, 1700, 1800, 1900, 2000, 2100]
})
df['g'] = ((df['count'].diff().gt(1)) | (df['Time'].diff().gt(100))).cumsum()
grouped = df.groupby('g') #group the rows based on temp column 'g'
df_list = []
for g, dfg in grouped:
df_list.append(dfg.drop('g', axis = 1).reset_index(drop = True))
The individual DFs can then be accessed using df[0] etc. and could be renamed as required.
print(df_list[0],'n')
print(df_list[1])
which gives:
count Time
0 3 400
1 4 500
2 5 600
3 6 700
4 7 800
count Time
0 15 1600
1 16 1700
2 17 1800
3 18 1900
4 19 2000
5 20 2100
2