I have a Delta Live Table (DLT) with a string column that contains a serialized JSON object. I'm trying to create a second DLT that reads from this table with some filtering, parses the JSON, and exposes the resulting object's properties as columns, with the schema inferred from the data.
I've tried all sorts of things (from_json, parse_json, json.loads…) but each fails for a different reason. The closest I've gotten is working with plain DataFrames in a notebook, something like:
<sql query to get json data>
...
df = _sqldf.rdd.map(lambda row: row.JsonData)  # pull out the raw JSON strings
spark.read.json(df).show()                     # Spark infers the schema here
But this obviously isn't a DLT. I've tried both SQL (CREATE OR REFRESH STREAMING TABLE …) and Python and can't figure out how to do it, or whether it's even possible.
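For reference, the closest Python pipeline shape I can get working is below, but only by hard-coding a schema up front; the table name source_table, the column JsonData, the filter, and the DDL schema string are all placeholders for my actual objects. What I actually want is for the schema to be inferred from the JSON strings themselves, the way spark.read.json does above:

import dlt
from pyspark.sql.functions import col, from_json

# Placeholder DDL schema; the whole problem is that I don't want to hard-code this
JSON_SCHEMA = "id STRING, amount DOUBLE, created_at TIMESTAMP"

@dlt.table(name="parsed_table")
def parsed_table():
    return (
        dlt.read("source_table")                 # upstream DLT holding the string column
        .filter(col("JsonData").isNotNull())     # placeholder for my real filter
        .withColumn("parsed", from_json(col("JsonData"), JSON_SCHEMA))
        .select("parsed.*")                      # promote the JSON properties to columns
    )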