I’m reading a Parquet file from an S3 bucket using Polars, and this is the code I use:
df = pl.read_parquet(parquet_file_name, storage_options=storage_options, hive_partitioning=False)
In the S3 bucket, the date column holds the following value (which is invalid, since the year is 0200):
start_date = 0200-03-01 00:00:00
After reading this value from the S3 bucket with polars.read_parquet, the date has been converted internally to
start_date = 1953-10-28 10:43:41.128654848
and the dtype of that column in the Polars DataFrame is set to Datetime(time_unit='ns', time_zone=None).
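My guess is that this is a signed 64-bit overflow rather than random corruption: a nanosecond-precision datetime stored as an int64 offset from 1970-01-01 can only cover roughly the years 1677 to 2262, and the year 0200 is far outside that range. Below is a minimal sketch of that arithmetic using only the Python standard library (no Polars involved; the helper name wrap_int64 is my own, and I am assuming the reader overflows with ordinary two's-complement wraparound). The wrapped value lands on 1953-10-28 with the same minutes, seconds, and nanosecond fraction (43:41.128654848) as what I see in the DataFrame; the hour may differ depending on how the original timestamp is stored or displayed.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
NS_PER_SECOND = 1_000_000_000


def wrap_int64(value: int) -> int:
    """Reduce an integer to the signed 64-bit range, as two's-complement overflow would."""
    return (value + 2**63) % 2**64 - 2**63


# Exact nanoseconds between the Unix epoch and 0200-03-01 00:00:00
# (Python's datetime uses the proleptic Gregorian calendar).
original = datetime(200, 3, 1)
delta = original - EPOCH
true_ns = (delta.days * 86_400 + delta.seconds) * NS_PER_SECOND

# Assumption: the out-of-range value wraps around modulo 2**64.
wrapped_ns = wrap_int64(true_ns)

# datetime only has microsecond resolution, so split off the last three digits.
micros, nanos_remainder = divmod(wrapped_ns, 1_000)
wrapped = EPOCH + timedelta(microseconds=micros)

print("fits in int64:", -2**63 <= true_ns < 2**63)
print("wrapped ns:", wrapped_ns)
print("wrapped timestamp:", wrapped, "plus", nanos_remainder, "ns")
```

If that is what is happening, the original value is already lost by the time the column is a Datetime with time_unit='ns', which would explain why casting afterwards cannot bring it back.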
Is there any way to retain the date as is, even though it is invalid? I tried a cast, but it doesn't help: read_parquet has already read the value as 1953-10-28, so casting the converted value cannot recover the original.
Any help would be highly appreciated.