I’m working with a table that contains both standard data types (like strings and integers) and nested JSON data in some columns. Here’s an example of the table structure:
| id | event_name | event_params |
|----|-------------|------------------------------------------------|
| 1 | Item1 | [{key=name_attribute1, value={...}}, {...}] |
| 2 | Item2 | [{key=name_attribute2, value={...}}, {...}] |
The nested data in the ‘event_params’ column follows exactly this structure:
'[{key=name_attribute1, value={string_value=some-string, int_value=null, float_value=null, double_value=null}}, {key=name_attribute2, value={string_value=some-value, int_value=null, float_value=null, double_value=null}}, {...}, {...}]'
(there are many dicts, with different keys in these columns, and there are keys that just appear in some rows)
I want to unnest these data to create new columns for each key while retaining the data from the non-nested columns. Ideally, I would like to achieve this in SQL. However, if that’s not feasible, I’m open to using Python as well.
Could someone please guide me on how to transform this nested structure into a flat table format either directly in SQL or using Python?
Thanks in advance for your help!
What i tried?
I tried do unnest the data using SQL, but it duplicate the lines.
What i expect?
I expect to have a table like this, with unnest data tranformed into columns. Important: I do not need every key, just selected ones.
| id | event_name | name_attribute1 | name_attribute2 | name_attribute_n
|----|-------------|--------------------|----------------------|------------------
| 1 | Item1 | some_string | some_string | ...
| 2 | Item2 | some_string | some_string | ...