I noticed a few differences in the JSON serialization of the same Polars expression, written in Python, across different versions of Polars (0.20.21, 0.20.22 and 0.20.23). I was wondering whether this is really intended, since these were PATCH version upgrades (semver) and the JSON serialized in 0.20.22 fails to deserialize in 0.20.23 (and vice versa).
Below is the Python code used to serialize the expression in different Polars versions:
import polars as pl
print(pl.__version__)
handle_nulls = pl.any_horizontal(pl.all().is_null())
expr = (
    pl
    .when(handle_nulls)
    .then(pl.lit(-1))
    .otherwise(
        pl.lit(100)
        + pl.col("input_column_1").fill_null(1)
    )
).alias("output_column")
print(expr.meta.serialize())
And here are the resulting JSON serializations:
0.20.21
{"Alias":[{"Ternary":{"predicate":{"Function":{"input":[{"Function":{"input":["Wildcard"],"function":{"Boolean":"IsNull"},"options":{"collect_groups":"ElementWise","fmt_str":"","input_wildcard_expansion":false,"returns_scalar":false,"cast_to_supertypes":false,"allow_rename":false,"pass_name_to_apply":false,"changes_length":false,"check_lengths":true,"allow_group_aware":true}}}],"function":{"Boolean":"AnyHorizontal"},"options":{"collect_groups":"ElementWise","fmt_str":"","input_wildcard_expansion":true,"returns_scalar":false,"cast_to_supertypes":false,"allow_rename":true,"pass_name_to_apply":false,"changes_length":false,"check_lengths":true,"allow_group_aware":true}}},"truthy":{"Literal":{"Int32":-1}},"falsy":{"BinaryExpr":{"left":{"Literal":{"Int32":100}},"op":"Plus","right":{"Function":{"input":[{"Column":"input_column_1"},{"Literal":{"Int32":1}}],"function":{"FillNull":{"super_type":"Unknown"}},"options":{"collect_groups":"ElementWise","fmt_str":"","input_wildcard_expansion":false,"returns_scalar":false,"cast_to_supertypes":true,"allow_rename":false,"pass_name_to_apply":false,"changes_length":false,"check_lengths":true,"allow_group_aware":true}}}}}}},"output_column"]}
0.20.22
{"Alias":[{"Ternary":{"predicate":{"Function":{"input":[{"Function":{"input":["Wildcard"],"function":{"Boolean":"IsNull"},"options":{"collect_groups":"ElementWise","fmt_str":"","input_wildcard_expansion":false,"returns_scalar":false,"cast_to_supertypes":false,"allow_rename":false,"pass_name_to_apply":false,"changes_length":false,"check_lengths":true,"allow_group_aware":true}}}],"function":{"Boolean":"AnyHorizontal"},"options":{"collect_groups":"GroupWise","fmt_str":"","input_wildcard_expansion":true,"returns_scalar":false,"cast_to_supertypes":false,"allow_rename":false,"pass_name_to_apply":false,"changes_length":false,"check_lengths":true,"allow_group_aware":true}}},"truthy":{"Literal":{"Int32":-1}},"falsy":{"BinaryExpr":{"left":{"Literal":{"Int32":100}},"op":"Plus","right":{"Function":{"input":[{"Column":"input_column_1"},{"Literal":{"Int32":1}}],"function":{"FillNull":{"super_type":"Unknown"}},"options":{"collect_groups":"ElementWise","fmt_str":"","input_wildcard_expansion":false,"returns_scalar":false,"cast_to_supertypes":true,"allow_rename":false,"pass_name_to_apply":false,"changes_length":false,"check_lengths":true,"allow_group_aware":true}}}}}}},"output_column"]}
0.20.23
{"Alias":[{"Ternary":{"predicate":{"Function":{"input":[{"Function":{"input":["Wildcard"],"function":{"Boolean":"IsNull"},"options":{"collect_groups":"ElementWise","fmt_str":"","input_wildcard_expansion":false,"returns_scalar":false,"cast_to_supertypes":false,"allow_rename":false,"pass_name_to_apply":false,"changes_length":false,"check_lengths":true,"allow_group_aware":true}}}],"function":{"Boolean":"AnyHorizontal"},"options":{"collect_groups":"GroupWise","fmt_str":"","input_wildcard_expansion":true,"returns_scalar":false,"cast_to_supertypes":false,"allow_rename":false,"pass_name_to_apply":false,"changes_length":false,"check_lengths":true,"allow_group_aware":true}}},"truthy":{"Literal":{"Int":-1}},"falsy":{"BinaryExpr":{"left":{"Literal":{"Int":100}},"op":"Plus","right":{"Function":{"input":[{"Column":"input_column_1"},{"Literal":{"Int":1}}],"function":"FillNull","options":{"collect_groups":"ElementWise","fmt_str":"","input_wildcard_expansion":false,"returns_scalar":false,"cast_to_supertypes":true,"allow_rename":false,"pass_name_to_apply":false,"changes_length":false,"check_lengths":true,"allow_group_aware":true}}}}}}},"output_column"]}
The differences between the JSONs from 0.20.21 and 0.20.22 don’t seem to break deserialization across those two versions: the expression serialized in 0.20.21 could still be deserialized in 0.20.22 and vice versa. However, I was wondering if it would still be safe to use them despite these changes.
Now, between the JSONs from 0.20.22 and 0.20.23 there were more “critical” differences, since the deserialization broke in each other’s versions with the following exception:
polars.exceptions.ComputeError: could not deserialize input into an expression
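To pin down exactly what changed between versions, a small recursive diff over the parsed JSON can help. The sketch below is only illustrative: `diff_json` and the `MISSING` marker are hypothetical helper names (not part of Polars), it uses only the standard library, and the inputs are short fragments copied from the serializations above (the `options` of the `AnyHorizontal` call and the `fill_null` function node) rather than the full expressions.

```python
import json

# Hypothetical marker for a key that exists on only one side of the comparison.
MISSING = "<missing>"


def diff_json(a, b, path="$"):
    """Yield (path, left, right) for every place where two parsed JSON values differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        for key in sorted(a.keys() | b.keys()):
            child = f"{path}.{key}"
            if key not in a:
                yield (child, MISSING, b[key])
            elif key not in b:
                yield (child, a[key], MISSING)
            else:
                yield from diff_json(a[key], b[key], child)
    elif isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        for i, (x, y) in enumerate(zip(a, b)):
            yield from diff_json(x, y, f"{path}[{i}]")
    elif a != b:
        # Differing scalars, differing types, or lists of different length.
        yield (path, a, b)


# "options" of the AnyHorizontal function, copied from the 0.20.21 and 0.20.22 output.
opts_0_20_21 = json.loads(
    '{"collect_groups":"ElementWise","fmt_str":"","input_wildcard_expansion":true,'
    '"returns_scalar":false,"cast_to_supertypes":false,"allow_rename":true,'
    '"pass_name_to_apply":false,"changes_length":false,"check_lengths":true,'
    '"allow_group_aware":true}'
)
opts_0_20_22 = json.loads(
    '{"collect_groups":"GroupWise","fmt_str":"","input_wildcard_expansion":true,'
    '"returns_scalar":false,"cast_to_supertypes":false,"allow_rename":false,'
    '"pass_name_to_apply":false,"changes_length":false,"check_lengths":true,'
    '"allow_group_aware":true}'
)

# fill_null node (without its "options"), copied from the 0.20.22 and 0.20.23 output.
fill_0_20_22 = json.loads(
    '{"input":[{"Column":"input_column_1"},{"Literal":{"Int32":1}}],'
    '"function":{"FillNull":{"super_type":"Unknown"}}}'
)
fill_0_20_23 = json.loads(
    '{"input":[{"Column":"input_column_1"},{"Literal":{"Int":1}}],'
    '"function":"FillNull"}'
)

diff_21_22 = list(diff_json(opts_0_20_21, opts_0_20_22))
diff_22_23 = list(diff_json(fill_0_20_22, fill_0_20_23))

for label, diffs in (("0.20.21 vs 0.20.22", diff_21_22), ("0.20.22 vs 0.20.23", diff_22_23)):
    print(label)
    for path, left, right in diffs:
        print(f"  {path}: {left!r} -> {right!r}")
```

On these fragments, the first comparison reports only changed values (`collect_groups` and `allow_rename`), while the second reports changed shapes (the `Int32` literal key becoming `Int`, and `FillNull` going from an object to a plain string). That seems consistent with the first pair still deserializing and the second pair failing, though I have not confirmed this against the Polars source.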
Given this scenario, I’d also like to know if Polars expressions serialized to JSON in Python Polars are safe to be deserialized and used in Rust Polars in a production environment. If so, is there a compatibility matrix between the Python and Rust releases, as they seem to be versioned differently (as of 2024-06-07, the latest version of Polars on PyPI is 0.20.31 while the latest version for Rust is 0.40.0)?
Thanks in advance!
Hideaki Kito is a new contributor to this site. Take care in asking for clarification, commenting, and answering.