I have a column 'user_contacts_attributes' in a Spark DataFrame:
+-----------+--------------------------+
| user_name | user_contacts_attributes |
+-----------+--------------------------+
| Test      | "user": {                |
|           |   "id": "16",            |
|           |   "username": "sam",     |
|           |   "level": "2.00"        |
|           | }                        |
+-----------+--------------------------+
The DataFrame, including 'user_contacts_attributes', has the schema below:
user_name:string
user_contacts_attributes:struct
    user:struct
        id:string
        level:string
        username:string
The resulting DataFrame should look like this:
+-----------+--------+----------+-------+
| user_name | parent | child    | value |
+-----------+--------+----------+-------+
| Test      | user   | id       | 16    |
| Test      | user   | level    | 2.00  |
| Test      | user   | username | sam   |
+-----------+--------+----------+-------+
I tried writing a UDF similar to the one in the question 'PySpark "explode" dict in column', but it failed.
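To make the requirement concrete, here is a minimal sketch of the per-row reshaping I am after, written in plain Python with no Spark involved. The helper name `flatten_two_levels` is hypothetical, and the sketch assumes the struct has already been converted to a nested dict with exactly two levels (parent, then child).

```python
from typing import Any, Dict, List, Tuple


def flatten_two_levels(attrs: Dict[str, Dict[str, Any]]) -> List[Tuple[str, str, str]]:
    """Turn {"user": {"id": "16", ...}} into [("user", "id", "16"), ...].

    Hypothetical helper: assumes exactly two levels of nesting.
    """
    rows = []
    for parent, children in attrs.items():
        # Sort child keys so the order matches the schema (id, level, username).
        for child in sorted(children):
            rows.append((parent, child, str(children[child])))
    return rows


# Sample row taken from the input table in the question.
user_name = "Test"
attrs = {"user": {"id": "16", "username": "sam", "level": "2.00"}}

# Prefix every (parent, child, value) triple with the user_name of its row.
result = [
    (user_name, parent, child, value)
    for parent, child, value in flatten_two_levels(attrs)
]

for row in result:
    print(row)
```

In Spark, logic like this would presumably have to be wrapped in a UDF that returns an array of structs and then be passed through `explode`, which is the part I could not get working.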