I am running Spark 3.5.0 with Scala 2.12 and had a question about mapPartition. I am creating a mutable list inside the mapPartitions and inserting items inside the iterator. When I try to get the items after the iterating over all the items in the partition, the list is empty. Does anyone know if this expected? Here is contrived example of the code:
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
val encoder = ExpressionEncoder(df.schema)
df.mapPartitions(it => {
val list = mutable.ListBuffer[Row]()
val result = it.map(row => {
list += row
row
})
println(s"list.size: ${list.size}")
result
})(encoder).show()