I’m trying to convert a column from str to float (it’s intended to be a float). However, the strings contain commas and dots, and I’m not able to replace the values correctly:
import polars as pl
df = pl.DataFrame({"numbers": ["1.004,00", "2.005,00", "3.006,00"]})
df = df.with_columns(
    df["numbers"].str.replace(".", "").str.replace(",", ".").cast(pl.Float64)
)
print(df)
I’m getting:
ComputeError: conversion from str to f64 failed in column ‘numbers’
for 3 out of 3 values: [“.004.00”, “.005.00”, “.006.00”]
I also tried simply removing the "." using:
df = df.with_columns(df["numbers"].str.replace(".", ""))
print(df)
But then the values come back without their first digit.
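str.replace treats its pattern as a regular expression by default, so an unescaped dot matches any character. Because it only replaces the first match, it removes the first character of each string. Just escape the dot: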
import polars as pl
df = pl.DataFrame({ "numbers": ["1.004,00", "2.005,00", "3.006,00"] })
df = (
    df
    .with_columns(
        # r"\." matches a literal dot; an unescaped "." would match any character
        pl.col("numbers").str.replace(r"\.", "").str.replace(",", ".").cast(pl.Float64)
    )
)
print(df)
Output:
shape: (3, 1)
┌─────────┐
│ numbers │
│ --- │
│ f64 │
╞═════════╡
│ 1004.0 │
│ 2005.0 │
│ 3006.0 │
└─────────┘