I’m using pyarrow.csv to read a CSV file and convert it to Parquet. The CSV file has a timestamp column containing integers that represent Unix time.
By default the column is read as int64, and if I try to set its type to a timestamp with ConvertOptions, the read fails:
import pyarrow as pa
import pyarrow.csv as pv

table = pv.read_csv("file.csv", convert_options=pv.ConvertOptions(
    column_types={
        'timestamp': pa.timestamp('s'),
    }
))
This raises the following error:
ArrowInvalid: In CSV column #1: CSV conversion error to timestamp[s]: invalid value '1705173443'