
Writing partitioned parquet files using polars without overwriting existing files

I have streaming data arriving as JSON, which I transform into a polars DataFrame and then write out as parquet files partitioned by two columns. I am noticing that when a new record falls into an existing partition, the write does not add another file to that folder; it overwrites the old data with the new data. I want to keep the old data and write the new data alongside it in the partition folder.
