i have a dataframe with the following schema:
root
|-- key: string (nullable = true)
|-- points: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- time: long (nullable = true)
| | |-- latitude: double (nullable = true)
| | |-- longitude: double (nullable = true)
| | |-- altitude: float (nullable = true)
i add an altitude_list column that is a string concatenation of the altitude field
df.withColumn("altitude_list", stringify_altitudes("points")).drop("points").show()
i’m using this funtion to stringify
def stringify_litetrack_points(points):
return F.expr("array_join(transform(points, x -> concat(round(x.altitude))), ':')")
this works find. however, sometimes the altitude is None (null) and in these cases, the function does nothing with that point array entry, meaning that if i have five points andone of the points has a None altitude, the resulting altitude_list will only have four items. i’d to insert a literal for None values.
if tried to use a when().otherwise() as follows…
def stringify_litetrack_points(points):
return F.expr("array_join(transform(points, x -> concat(when(x.altitude.isNotNull(), round(x.altitude)).otherwise('*'))), ':')")
…but i get a syntax error at the otherwise().
ParseException:
Syntax error at or near '('(line 1, pos 98)
== SQL ==
array_join(transform(points, x -> concat(when(x.altitude.isNotNull(), round(x.altitude)).otherwise('*'))), ':')
--------------------------------------------------------------------------------------------------^^^
what am i doing wrong? should i be doing this a different way? thanks.