For example:
import polars as pl
dates = pl.DataFrame(pl.Series("date", ["2024,2", "2024,12"]))
dates = dates.with_columns(pl.col("date").str.to_date("%Y,%m"))
This does not work because the strings have no day component. Python’s strptime, by contrast,
fills in a default day of 1:
from time import strptime
strptime("2024,2", "%Y,%m")
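To make the default visible, here is a minimal sketch using only the standard library. The variable name `parsed` is just for illustration; the point is that the day field comes back as 1 even though the format string has no `%d`.

```python
from time import strptime

# The format has no %d directive, so strptime falls back to its default day.
parsed = strptime("2024,2", "%Y,%m")

# Inspect the fields of the resulting struct_time.
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)
```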
Is there any way to do something similar with polars, apart from just appending a day to the strings?
import polars as pl
dates = pl.DataFrame(pl.Series("date", ["2024,2", "2024,12"]))
dates = dates.with_columns((pl.col("date") + ",1").str.to_date(r"%Y,%m,%d"))
That is an okay solution, but it does unnecessary work. The best solution I have come up with that avoids the redundant work is:
import polars as pl
dates = pl.DataFrame(pl.Series("date", ["2024,2", "2024,12"]))
dates = (
    dates.with_columns(
        pl.col("date").str.split(",").list.to_struct(fields=["year", "month"])
    )
    .unnest("date")
    .with_columns(date=pl.date(pl.col("year"), pl.col("month"), 1))
)
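For reference, the same split-and-construct idea can be sketched in plain Python, which gives the values I would expect either polars version to produce. The helper name `first_of_month` is hypothetical and only there for illustration; this is not a polars API.

```python
from datetime import date

raw = ["2024,2", "2024,12"]


def first_of_month(text: str) -> date:
    """Split a 'year,month' string and build a date on day 1 of that month."""
    year, month = text.split(",")
    return date(int(year), int(month), 1)


# Reference values to compare the polars results against.
expected = [first_of_month(s) for s in raw]
print(expected)
```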