I am trying to filter a large dask DataFrame (~800k rows and 30 columns). I want to use the
`query` method of the dask DataFrame.
`
start_date = np.datetime64(start_date, 'ns')
end_date = np.datetime64(end_date, 'ns')
query_start = f'BeginObservationDate > {start_date}'
query_end = f'EndObservationDate > {end_date}'
dask_df.query(query_start + ' and ' + query_end).compute()`
But I get this error:
SyntaxError('invalid syntax', ('<unknown>', 1, 8, 'BeginObservationDate >2024 -0 6 -0 7 T00 :00 :00.000000 and EndObservationDate >2025 -0 2 -12 T00 :00 :00.000000))
I don't know what is wrong with my code. Maybe dask can't compare datetimes in queries?
Has anyone had this problem before, or does anyone know how I can fix it?
Thanks,
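
A possible explanation, offered as a sketch rather than a confirmed diagnosis: `query` parses its argument as a Python-like expression, so a date interpolated into the f-string without quotes (`2024-06-07T00:00:00...`) is not a valid literal, which would match the `SyntaxError` above. The stdlib-only snippet below uses `ast.parse` as a stand-in for that parser (an assumption, since the real parsing happens inside pandas), and `build_date_query` and `is_valid_expression` are hypothetical helper names, not part of dask.

```python
import ast


def build_date_query(start_date, end_date):
    """Hypothetical helper: build the query string with the dates quoted."""
    # The inner double quotes turn each date into a string literal, so the
    # expression parser sees a string instead of a bare 2024-06-07T00:00:00.
    return (
        f'BeginObservationDate > "{start_date}" '
        f'and EndObservationDate > "{end_date}"'
    )


def is_valid_expression(expr):
    """Return True if expr parses as a Python expression (stand-in check)."""
    try:
        ast.parse(expr, mode="eval")
        return True
    except SyntaxError:
        return False


# What the original f-strings produce: bare, unquoted dates.
unquoted = (
    "BeginObservationDate > 2024-06-07T00:00:00 "
    "and EndObservationDate > 2025-02-12T00:00:00"
)
# The same condition with the dates wrapped in quotes.
quoted = build_date_query("2024-06-07T00:00:00", "2025-02-12T00:00:00")

print(is_valid_expression(unquoted))  # False
print(is_valid_expression(quoted))    # True

# With the DataFrame from the question (not run here):
# dask_df.query(quoted).compute()
```

If the quoted form still misbehaves, a plain boolean mask such as `dask_df[(dask_df["BeginObservationDate"] > start_date) & (dask_df["EndObservationDate"] > end_date)]` avoids string parsing entirely.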