PySpark with Iceberg saving multiple small files

I have a PySpark job running on AWS Glue. The job processes the data and saves it as an Apache Iceberg table. The problem is that the write generates many small files within each partition. I have tried several ways of saving the data, but none of them resolved the issue. Here is my code snippet.