Relative Content

Tag Archive for python-polars

Polars memory usage just building up and not coming down

I have an API that does a lot of computation, writes the output to a database, and also returns it. It was largely written in pandas, and I am trying to move it to polars. Very basic code to mimic this API is below. I have this service running, and its memory usage just keeps going up. I suspect the workers will soon run out of memory and restart, which will cause downtime. Past threads say that the memory allocator holds on to memory for the lifetime of the process, but since this runs as a supervisor worker, the process stays up unless it is restarted manually or stopped by a signal such as SIGTERM or SIGKILL.
How can I solve this? One option that has been mentioned is to have the worker process exit once it has finished processing, but I am not sure how to do that. I am also not sure it is ideal, since each new worker coming up has its own startup overhead.
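
One way the "worker exits after processing" idea might look is sketched below, using only the standard library's `multiprocessing.Pool` and its `maxtasksperchild` argument. The names `handle_request` and `make_recycling_pool` are hypothetical, and the body of `handle_request` is a placeholder for the real polars computation. This is a sketch under those assumptions, not a drop-in fix for any particular service.

```python
import multiprocessing as mp
import os


def handle_request(values):
    """Hypothetical stand-in for the per-request polars computation."""
    result = sum(v * v for v in values)
    # Return the worker's PID as well, so the caller can see workers being replaced.
    return os.getpid(), result


def make_recycling_pool(tasks_per_worker=1, start_method="spawn"):
    """Create a one-worker pool whose worker exits after `tasks_per_worker` tasks.

    When a worker exits, the operating system reclaims all of its memory,
    including anything its allocator was still holding on to. The pool then
    starts a fresh worker for the next task.
    """
    # "spawn" starts each worker from a clean interpreter; forking a parent
    # that already has threads running can deadlock.
    ctx = mp.get_context(start_method)
    return ctx.Pool(processes=1, maxtasksperchild=tasks_per_worker)
```

With the `spawn` start method, the pool has to be created under an `if __name__ == "__main__":` guard in a script. A call such as `pool.apply(handle_request, ([1, 2, 3],))` then runs in a child process, and a larger `tasks_per_worker` spreads the startup overhead of each new worker over more requests.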

Rolling KPI calculations in polars, index not visible

How do I add rolling KPIs to the original dataframe in polars? When I do a group by, I do not see an index, so I can't join the result back. I want to keep all the original columns in the dataframe intact and add the rolling KPIs to it.

Best way to aggregate an iterable of `polars.DataFrame` or `polars.Series` objects

I am looking for the best way to compute a per-row running sum (average) over a large number of `polars.DataFrame` objects, where each frame can potentially have a large number of rows. I’d like the implementation to be fast, but I also want to keep the memory footprint in check, i.e. never hold all the frames in memory at once before doing the aggregation.