I’ve encountered an issue while implementing Spark DataFrame partition overwrite in a Synapse Notebook environment. Removing the specified account numbers from the partitions works, but when a partition becomes empty as a result, it is not overwritten.
Code implementation:
# Set dynamic partition overwrite mode
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
# Define partitioned path
partitioned_path = "<partitioned_path>"
# Define partition columns
partition_columns = ["region", "partition_key"]
# Step 2: Create DataFrame with more data and partitions
data = [
    ("123", 10, "region1", "partition1"),
    ("456", 20, "region2", "partition2"),
    ("789", 30, "region1", "partition1"),
    ("234", 15, "region1", "partition1"),
    ("567", 25, "region2", "partition2"),
    ("890", 35, "region3", "partition3")
]
df = spark.createDataFrame(data, ["acc_no", "value", "region", "partition_key"])
# Step 3: Write DataFrame to Partitioned Path
df.write.partitionBy(*partition_columns).mode("overwrite").parquet(partitioned_path)
# Step 4: Read Partitioned Data and Filter
acc_nos_to_remove = ["456", "567", "123"]
try:
    # Read partitioned data
    partitioned_df = spark.read.option("basePath", partitioned_path).parquet(partitioned_path)
    # Filter DataFrame to remove specified account numbers
    filtered_df = partitioned_df.filter(~partitioned_df.acc_no.isin(acc_nos_to_remove))
    # Step 5: Update Partition
    filtered_df.write.format('parquet').option('header', True).mode('overwrite').option("maxPartitionBytes", 128*1024*1024).save(partitioned_path)
    print("Partition data updated successfully.")
except Exception as e:
    print(f"Error occurred while reading or writing Parquet files : {e}")
Note: This works fine in Databricks but not in Synapse.
I expected that when account numbers are removed from partitions and a partition becomes empty as a result, that partition would be overwritten with the updated data. However, in the Synapse Notebook, empty partitions are not overwritten as expected. Instead, they seem to persist with their previous data.
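To make the expected outcome concrete, here is a small plain-Python sketch (no Spark session needed) that replays the filter on the sample rows above and counts what is left in each partition. The helper names `partition_dir` and `remaining_rows_per_partition` are made up for illustration, and the directory names assume the usual Hive-style `column=value` layout that `partitionBy` writes.

```python
from collections import defaultdict

# Sample rows copied from the question: (acc_no, value, region, partition_key)
data = [
    ("123", 10, "region1", "partition1"),
    ("456", 20, "region2", "partition2"),
    ("789", 30, "region1", "partition1"),
    ("234", 15, "region1", "partition1"),
    ("567", 25, "region2", "partition2"),
    ("890", 35, "region3", "partition3"),
]
acc_nos_to_remove = ["456", "567", "123"]


def partition_dir(region, partition_key):
    """Hive-style directory name, assumed layout for partitionBy("region", "partition_key")."""
    return f"region={region}/partition_key={partition_key}"


def remaining_rows_per_partition(rows, to_remove):
    """Hypothetical helper: count the rows left in each partition after the filter."""
    remove = set(to_remove)
    counts = defaultdict(int)
    for acc_no, _value, region, partition_key in rows:
        key = partition_dir(region, partition_key)
        # Adding 0 still registers the partition, so emptied partitions show up with a count of 0.
        counts[key] += 0 if acc_no in remove else 1
    return dict(counts)


counts = remaining_rows_per_partition(data, acc_nos_to_remove)
emptied = sorted(path for path, n in counts.items() if n == 0)

for path, n in sorted(counts.items()):
    print(f"{path}: {n} row(s) left")
print("Partitions that become empty:", emptied)
```

With this sample data, `region=region2/partition_key=partition2` is the only partition left with no rows, so that is the directory I would expect to be cleared and the one that still holds its old files in Synapse.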
Any help or insights on resolving this issue would be greatly appreciated.