
Tags: scala, apache-spark, spark-structured-streaming

Convert nested avro structures to flat schema in Apache Spark

I have a use case where I read data from Kafka and write to a sink. The data in Kafka is in Avro, and the fields are wrapped in an Avro map. The map does not always have the same keys; they vary with the type of data. I need to flatten this map, i.e. turn each key into a column whose value is the corresponding map value. I have a standard schema to flatten to before writing, and I write the data in Delta / Parquet format at the sink. How can I approach this?
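
Below is a minimal sketch of one possible approach, assuming the spark-avro package is on the classpath (for `from_avro`), that the Kafka value is plain Avro bytes rather than Confluent wire format, and that the map values are strings. The Avro schema, target column names, broker, topic, and paths are all hypothetical placeholders for the real ones.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.avro.functions.from_avro

object FlattenAvroMapStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("flatten-avro-map").getOrCreate()

    // Hypothetical Avro schema: a record whose "payload" field is a map<string, string>.
    val avroSchemaJson =
      """{
        |  "type": "record",
        |  "name": "Event",
        |  "fields": [
        |    {"name": "payload", "type": {"type": "map", "values": "string"}}
        |  ]
        |}""".stripMargin

    // The fixed target schema to flatten into (hypothetical column names).
    val targetColumns = Seq("device_id", "temperature", "humidity")

    val kafkaDf = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // placeholder broker
      .option("subscribe", "events")                    // placeholder topic
      .load()

    // Decode the Avro value column into a struct using the schema above.
    val decoded = kafkaDf.select(from_avro(col("value"), avroSchemaJson).as("event"))

    // Pull each expected key out of the map; keys absent from a record become null,
    // so every row conforms to the standard flat schema.
    val flattened = decoded.select(
      targetColumns.map(k => col("event.payload").getItem(k).as(k)): _*
    )

    flattened.writeStream
      .format("delta")                                          // or "parquet"
      .option("checkpointLocation", "/tmp/checkpoints/flatten") // placeholder path
      .start("/tmp/output/flattened")                           // placeholder path
      .awaitTermination()
  }
}
```

Because the target schema is fixed, selecting the expected keys with `getItem` keeps the output columns stable even when individual messages carry different subsets of keys; keys outside the standard schema are simply dropped.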