One of my Spark jobs is failing because the executor containers are dying with "java.lang.OutOfMemoryError: Java heap space". Any recommendations are appreciated.
I am running on an EMR cluster of 200 r7g.16xlarge nodes (64 cores and 488 GB memory each). I have configured executors with 5 cores and 34 GB of executor memory.
spark-submit \
  --conf spark.executor.cores=5 \
  --conf spark.executor.memory=34g \
  --conf spark.sql.windowExec.buffer.spill.threshold=2000000 \
  --conf spark.sql.windowExec.buffer.in.memory.threshold=2000000 \
  --conf spark.executor.memoryOverhead=5g \
  --conf spark.executor.instances=2400 \
  --conf spark.driver.memory=34g \
  --conf spark.sql.adaptive.coalescePartitions.initialPartitionNum=24000 \
  --class comance \
  s3://seh-gly-1.0.jar \
  --region_id 1 \
  --period monthly \
  --dataset_date 2024-06-30 \
  --partition_date 20240630 \
  --snapshot_date 2024-09-22 \
  --stage beta \
  --repartition_cnt 24000 \
  --repartition_cnt_final 7200 \
  --accuracy 1000 \
  --orgs "aX"
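For context, here is a sanity check of the sizing arithmetic behind these settings (assuming YARN can allocate roughly the full 488 GB per node; actual yarn.nodemanager.resource.memory-mb may be lower):

```python
# Sanity check of the executor sizing above (all figures from the question).
nodes = 200                  # r7g.16xlarge instances
node_mem_gb = 488            # memory per node as stated
executor_mem_gb = 34         # spark.executor.memory
overhead_gb = 5              # spark.executor.memoryOverhead
cores_per_executor = 5       # spark.executor.cores

mem_per_executor = executor_mem_gb + overhead_gb        # 39 GB per container
executors_per_node = node_mem_gb // mem_per_executor    # 12 executors fit
total_executors = nodes * executors_per_node            # 2400, matches
                                                        # spark.executor.instances
cores_used = executors_per_node * cores_per_executor    # 60 of 64 cores per node

print(mem_per_executor, executors_per_node, total_executors, cores_used)
```

So the cluster-level math checks out; the heap-space error is more likely from skewed or oversized partitions than from total capacity, which is why the window-spill thresholds and repartition counts above matter.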