I have this big data base that I need to separate into diferent segments (diferent “valid until dates”) and then get each o the segments into a excel workbook. Each of the segments also should have its own excelsheet.
Df Exemple:
A valid until C D
Date
01/08/2023 Tesouro Educa+ 15/12/2030 5,06 5,18
02/08/2023 Tesouro Educa+ 15/12/2030 5,05 5,17
03/08/2023 Tesouro Educa+ 15/12/2030 5,02 5,14
04/08/2023 Tesouro IPCA+ 15/12/2030 5,02 5,14
07/08/2023 Tesouro IPCA+ 15/12/2030 4,98 5,10
...
01/08/2023 Tesouro Selic 15/12/2031 5,05 5,17
02/08/2023 Tesouro Selic 15/12/2031 5,03 5,15
03/08/2023 Tesouro Educa+ 15/12/2031 5,00 5,12
04/08/2023 Tesouro Educa+ 15/12/2031 5,02 5,14
07/08/2023 Tesouro Educa+ 15/12/2031 4,99 5,11
What i tried:
df = pd.read_csv(r'PATH', sep=';')
df.sort_values(by=['valid until','Date'], key=lambda x: np.argsort(natsorted(dados['Date'])))
valid_until_list=dados['valid until'].unique()
for i in valid_until_list:
df2=df[(df['Date']==f'{i}')]
df2=df2.set_index(['Data Base'])
with pd.ExcelWriter(r'NEW_PATH',engine='openpyxl', if_sheet_exists='replace', mode='a') as writer:
i=i.replace('/','_')
# Here i tried changing the name of the worksheets:
sheet_title=''
if df2['A']=='Tesouro IPCA+ com Juros Semestrais':
sheet_title='IPCA+ Juros'
elif nova_df2['A']=='Tesouro IPCA+':
sheet_title='IPCA+'
else:
sheet_title=''
df2.to_excel(writer, sheet_name=f'{sheet_title}_{i}')`
The code did work until i tried to change de worksheet title, now is showing this: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I’m here to ask 2 things. My df is kinda big (145712 rows), so de code is taking way to long to run. Is there a faster way to do this, maybe without using for and if loops? The second would be: how do i fix this title Error? It is not necessary, but would make my life soooo much easier :))