I have the following code:
import polars as pl
df = pl.DataFrame({
'name': ['CHECK', 'CASH', 'BAR', 'SET'],
'category1': ['AM', 'EU', 'EU', 'AS'],
'category2': ['CA', 'FR', 'DE', 'CX'],
'quantity': [100, -20, 10, -70],
'exposure': [11, -3, 2, 8]
})
FLT_COLS = ['quantity', 'exposure']
OTHER_COLS = [c for c in df.columns if c not in FLT_COLS]
df_temp = df.select(pl.col(FLT_COLS)).sum()
.with_columns(pl.lit('TOTAL').alias(OTHER_COLS[0]))
.with_columns([pl.lit('').alias(c) for c in OTHER_COLS[1:]])[df.columns]
pl.concat([df, df_temp])
which gives me the desired output
shape: (5, 5)
┌───────┬───────────┬───────────┬──────────┬──────────┐
│ name ┆ category1 ┆ category2 ┆ quantity ┆ exposure │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str ┆ i64 ┆ i64 │
╞═══════╪═══════════╪═══════════╪══════════╪══════════╡
│ CHECK ┆ AM ┆ CA ┆ 100 ┆ 11 │
│ CASH ┆ EU ┆ FR ┆ -20 ┆ -3 │
│ BAR ┆ EU ┆ DE ┆ 10 ┆ 2 │
│ SET ┆ AS ┆ CX ┆ -70 ┆ 8 │
│ TOTAL ┆ ┆ ┆ 20 ┆ 18 │
└───────┴───────────┴───────────┴──────────┴──────────┘
That is, add a row that contains the total sum across a specific list of columns FLT_COLS
, labels the other first column TOTAL
and then puts ""
into the remaining non-summed columns.
Is there a nicer way to add this row? I feel my code looks very clumsy. I also don’t like that I have to specify [df.columns]
to re-order the columns, as this feels very inefficient.