Load multiple large CSV files into parquet while creating new column for file name
I have a collection of up to 1000 CSV files, each ~1 GB uncompressed. I want to create a single Parquet dataset from them, with a column recording which file each row came from. A sketch of one approach follows.
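A minimal sketch using Dask's `include_path_column` option of `dd.read_csv`, which adds a column holding the path of the file each row was read from. The glob pattern, output path, and column name here are placeholders, not values from the question.

```python
import dask.dataframe as dd

# Read all CSVs lazily; include_path_column adds a column with the
# path of the originating file for every row.
df = dd.read_csv(
    "data/*.csv",                       # placeholder glob for the ~1000 files
    include_path_column="source_file",  # new column holding the file path
    blocksize="256MB",                  # split each ~1 GB file into smaller partitions
)

# The path column holds the full path; keep only the file name if desired.
df["source_file"] = df["source_file"].astype(str).str.rsplit("/", n=1).str[-1]

# Write everything out as one partitioned Parquet dataset.
df.to_parquet("output/dataset.parquet", engine="pyarrow", write_index=False)
```

Reading with a glob and a moderate `blocksize` keeps memory bounded, since Dask streams partitions through rather than loading all ~1 TB at once.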
ValueError: Appended dtypes differ when appending two simple tables with dask
I am using Dask to write multiple very large dataframes to a single Parquet dataset in Python.
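This error typically appears when a later `to_parquet(..., append=True)` call writes columns whose dtypes do not match the schema already on disk. A hedged sketch of one workaround: cast every dataframe to a fixed dtype contract before appending. The schema dict, function name, and paths below are illustrative assumptions, not code from the question.

```python
import dask.dataframe as dd

# Illustrative dtype contract shared by every dataframe in the dataset.
SCHEMA = {"id": "int64", "value": "float64", "label": "object"}

def append_to_dataset(df, path, first=False):
    """Cast columns to a fixed schema, then write/append to one Parquet dataset."""
    df = df.astype(SCHEMA)  # align dtypes so appends match the on-disk schema
    df.to_parquet(
        path,
        engine="fastparquet",
        append=not first,       # first call creates the dataset, later ones append
        ignore_divisions=True,  # skip the index-divisions check across appends
        write_index=False,
    )

# Hypothetical usage: write several source files into one dataset.
for i, src in enumerate(["part1.csv", "part2.csv"]):
    part = dd.read_csv(src)
    append_to_dataset(part, "output/dataset.parquet", first=(i == 0))
```

Pinning dtypes up front avoids the mismatch where, for example, one input infers a column as `int64` and another as `float64` because of missing values.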