I'm new to Python. I have time series datasets where time is always in the first column, followed by multiple columns of data (anywhere from 3 data columns up to 100, with around 500k rows). I need to process 25+ individual files, so this has to run in a for loop. The time header always stays the same, but the other header names can change. For this particular dataset, I want to find the max value of each column and the time at which it occurs. Then I want to find where the change of slope occurred prior to the max value, and use that as a second value vs. time. From there, I want to calculate the slope between those two points for each column and output the results to CSV. The other issue is that every other column typically holds negative values. I am using pandas.
An efficient script would be ideal, but speed is not a priority.
Any help would be appreciated!
import pandas as pd

# Drop the 'Run' and 'Time.1' headers, keep the raw strain columns, insert the filename.
# source_dir7 and output_dir7 are pathlib.Path objects defined earlier.
for file in source_dir7.glob('*.csv'):
    df_header_1 = pd.read_csv(file, header=10, encoding='UTF-8', skiprows=[11, 12], low_memory=False)
    df_header_2 = df_header_1.drop(columns=['Run', 'Time.1'])
    df_header_3 = df_header_2.loc[:, df_header_2.columns.str.contains('Time|TRNV|RP')]
    df_header_3.insert(0, "Filename", file.name)
    print(df_header_3)
    df_header_3.to_csv(output_dir7.joinpath(file.name), index=False)
for file in source_dir2.glob('*.csv'):
    # Create the max/min file
    df_header_6 = pd.read_csv(file, header=0, low_memory=False)
    df_header_7 = df_header_6.agg(['max', 'min'])
    df_header_8 = df_header_7.groupby(df_header_7['Filename'], as_index=False).agg(['max', 'min'])
    # Note: iloc[:0:1] selects no rows, so nothing is dropped here
    df_header_9 = df_header_8.drop(df_header_8.iloc[:0:1].index)
    print(df_header_8)
    df_header_9.to_csv(output_dir2.joinpath(file.name), index=False)
This is where I am lost. I don't know how to add the time reference for each max value, so that the max/min file holds a Max/Min X,Y pair for each column. I also don't know how to do the remaining steps: finding where the slope change starts in each direction, which would give me the Start X,Y coordinates for each column. From there, the slope should be calculated from Max/Min X,Y vs. Start X,Y and written to a CSV.
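To make the steps above concrete, here is a rough sketch of one way they could fit together. It is not a definitive implementation. It assumes the time column is named `Time`, and it takes the peak to be the sample of largest magnitude, so negative-going columns use their minimum. It also assumes the "start of the slope change" can be approximated as the last sample before the peak that is still within a small fraction of the peak amplitude, measured from the first sample. The function name `peak_onset_slope`, the `threshold_frac` parameter, and the demo data are all hypothetical.

```python
import numpy as np
import pandas as pd


def peak_onset_slope(df, time_col="Time", threshold_frac=0.05):
    """Per numeric column: peak point, onset point, and the slope between them."""
    t = df[time_col].to_numpy(dtype=float)
    # Only numeric columns are analysed, so a text column such as Filename is skipped.
    data_cols = [c for c in df.select_dtypes(include="number").columns if c != time_col]
    rows = {}
    for col in data_cols:
        y = df[col].to_numpy(dtype=float)
        row = dict.fromkeys(
            ["peak_time", "peak_value", "onset_time", "onset_value", "slope"], np.nan
        )
        rows[col] = row
        if np.isnan(y).all():
            continue  # nothing to analyse in this column
        # Largest magnitude picks the max of positive-going columns
        # and the min of negative-going ones.
        i_peak = int(np.nanargmax(np.abs(y)))
        row["peak_time"], row["peak_value"] = t[i_peak], y[i_peak]
        # ASSUMED onset rule: last sample before the peak that is still within
        # threshold_frac of the peak amplitude, measured from the first sample.
        baseline = y[0]
        amplitude = y[i_peak] - baseline
        quiet = np.abs(y[:i_peak] - baseline) <= threshold_frac * abs(amplitude)
        candidates = np.flatnonzero(quiet)
        if candidates.size == 0:
            continue  # peak is the first sample, or the baseline is NaN
        i_on = int(candidates[-1])
        row["onset_time"], row["onset_value"] = t[i_on], y[i_on]
        # Slope of the straight line from the onset point to the peak point.
        row["slope"] = (y[i_peak] - y[i_on]) / (t[i_peak] - t[i_on])
    return pd.DataFrame.from_dict(rows, orient="index")


# Made-up demo data: one positive-going and one negative-going column.
demo = pd.DataFrame({
    "Filename": "demo.csv",
    "Time": np.arange(10.0),
    "A": [0, 0, 0, 0, 1, 2, 3, 4, 3, 2],
    "B": [0, 0, 0, 0, -1, -2, -3, -4, -3, -2],
})
result = peak_onset_slope(demo)
print(result)
```

For the 25+ files, the same function could be called inside the existing `glob` loop on each DataFrame after `read_csv`, with each result written out via `to_csv`. On noisy data the threshold rule would probably need tuning, or smoothing beforehand. A rule based on the derivative (for example `np.gradient`) is another option if the onset is better defined by a change in slope than by a change in level.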
I've attached a screenshot of how my dataset looks after stripping out the info I do not need, and a split view showing how the dataset looks when referencing the max and min columns. I've also added what my dataset looks like when plotted in matplotlib.