When using getField() I am able to return simple values, but when I try to use it on an array of objects it returns None. I've created a simplified version of the Scala code I'm having problems with below. I'm trying to run a Glue job and, as part of that, do some mapping on a DynamicFrame. I'm aware I could just switch to a DataFrame to do this, but I'm curious why the DynamicFrame doesn't work as expected.
// some code to get the source
//
// example object
// {
// "name" : "Mike",
// "age": 43,
// "kids" : [
// {"age" : 10, "name" : "Jack"},
// {"age" : 13, "name" : "Jill"}
// ]
// }
val exampleFrame = dataSource.getDynamicFrame()
val mappedExampleFrame = exampleFrame.map { record =>
  println("does kids exist = " + record.schema.containsField("kids")) // this returns true
  val name = record.getField("name").getOrElse("NA").toString // returns Mike
  val age = record.getField("age").getOrElse(0).asInstanceOf[Int] // returns 43
  val kids = record.getField("kids").getOrElse(Seq()).asInstanceOf[List[Map[String, Any]]] // returns an empty sequence
  // do some mapping
}
// some sink code
I have also done some other debugging and confirmed that when using get instead of getOrElse, the result really is None (getField() returns an Option). This doesn't make sense to me, given that we've confirmed the field is in the schema and I have confirmed the record definitely does have a value there. As I said, I'm aware this could be bypassed by just using a DataFrame (and in practice that is what I have done), but I'd still like to know why the DynamicFrame doesn't work as expected.
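To make that debugging step concrete, here is a minimal sketch in plain Scala with no Glue dependency. It shows why getOrElse(Seq()) cannot tell a missing field apart from a present-but-empty array, and how matching on the Option can. The helper name describeField and the sample values are hypothetical stand-ins, not anything from the Glue API.

```scala
object FieldProbe {
  // Hypothetical helper (not part of the Glue API): describes what an
  // Option[Any] holds at runtime instead of substituting a default.
  def describeField(value: Option[Any]): String = value match {
    case None       => "None"
    case Some(null) => "Some(null)"
    case Some(v)    => s"Some(${v.getClass.getName})"
  }

  def main(args: Array[String]): Unit = {
    // Assumed stand-ins for what a field lookup might return.
    val missing: Option[Any] = None
    val emptyArray: Option[Any] = Some(Seq.empty[Any])

    // getOrElse collapses both cases to an empty sequence,
    // so the two are compared for equality here ...
    println(missing.getOrElse(Seq()) == emptyArray.getOrElse(Seq()))

    // ... while matching on the Option keeps them distinguishable
    // and reveals the runtime class of whatever is inside.
    println(describeField(missing))
    println(describeField(emptyArray))
    println(describeField(Some("Mike")))
    println(describeField(Some(43)))
  }
}
```

Inside the map above, the equivalent call would be describeField(record.getField("kids")), which would at least show whether the Option is empty or holds a type the cast does not expect. I have not verified what it prints on Glue itself.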
Additionally, I've also tried using a case class:
val kids = record.getField("kids").getOrElse(Seq()).asInstanceOf[List[Kids]] // returns an empty sequence
and pulling it out as its own record:
val kids = record.getField("kids").getOrElse(Seq()).asInstanceOf[List[DynamicRecord]] // returns an empty sequence
Also, this is Glue v4, if that makes a difference, with Spark 3.3 and Scala 2.12.18.