<code>import polars as pl
import numpy as np

df_sim = pl.DataFrame({
    "daily_n": [1000, 2000, 3000, 4000],
    "prob": [.5, .5, .5, .6],
    "size": 1
})
df_sim = df_sim.with_columns(
    pl.struct(["daily_n", "prob", "size"])
    .map_elements(lambda x:
        np.random.binomial(n=x['daily_n'], p=x['prob'], size=x['size']))
    .cast(pl.Int32)
    .alias('events')
)
df_sim
</code>
However, the following code fails with the message
"TypeError: float() argument must be a string or a number, not 'Expr'":
<code>df_sim.with_columns(
    np.random.binomial(n=col('daily_n'), p=col('prob'), size=col('size'))
    .alias('events')
)
</code>
Why do some functions require the use of struct(), map_elements() and lambda, while others do not?
In my case below, I am able to simply refer to polars columns as function arguments by using col().
<code>def local_double(x):
    return 2 * x

df_ab.with_columns(rev_2x=local_double(col("revenue")))
</code>