I want to write a polars when-then-otherwise expression with the following logic:
IF a1 and a2 are not null AND b1 and b2 are null
THEN b1=a1, b2=a2
ELSE keep values of b1 and b2
As an example I have this dataframe
df = pl.DataFrame({
'a1': [1,1,1],
'a2': [2,2,None],
'b1': [3,None,None],
'b2': [4,None,None]
})
┌─────┬──────┬──────┬──────┐
│ a1 ┆ a2 ┆ b1 ┆ b2 │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪══════╪══════╪══════╡
│ 1 ┆ 2 ┆ 3 ┆ 4 │
│ 1 ┆ 2 ┆ null ┆ null │
│ 1 ┆ null ┆ null ┆ null │
└─────┴──────┴──────┴──────┘
I want this result
┌─────┬──────┬──────┬──────┐
│ a1 ┆ a2 ┆ b1 ┆ b2 │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪══════╪══════╪══════╡
│ 1 ┆ 2 ┆ 3 ┆ 4 │
│ 1 ┆ 2 ┆ 1 ┆ 2 │
│ 1 ┆ null ┆ null ┆ null │
└─────┴──────┴──────┴──────┘
I tried the following
df.with_columns(pl.when(pl.col('a1').is_not_null() &
pl.col('a2').is_not_null() &
pl.col('b1').is_null() &
pl.col('b2').is_null()
)
.then(pl.col('a1','a2'))
.otherwise(pl.col('b1','b2'))
.name.map(lambda x: {'a1':'b1','a2':'b2'}.get(x))
)
but this raises an error
ComputeError: 'with_columns' failed
The reason: expanding more than one `col` is not allowed:
New contributor
gogodigi is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
You can’t have multiple columns in the then/otherwise.
You either have to repeat the when/then/otherwise
for each column of output or you can wrap them in a struct, drop the old columns, and unnest the struct.
(
df
.with_columns(pl.when(pl.col('a1').is_not_null() &
pl.col('a2').is_not_null() &
pl.col('b1').is_null() &
pl.col('b2').is_null()
)
.then(pl.struct(b1='a1',b2='a2').alias('z'))
.otherwise(pl.struct('b1','b2').alias('z'))
)
.drop('b1','b2')
.unnest('z')
)
shape: (3, 4)
┌─────┬──────┬──────┬──────┐
│ a1 ┆ a2 ┆ b1 ┆ b2 │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪══════╪══════╪══════╡
│ 1 ┆ 2 ┆ 3 ┆ 4 │
│ 1 ┆ 2 ┆ 1 ┆ 2 │
│ 1 ┆ null ┆ null ┆ null │
└─────┴──────┴──────┴──────┘