How can I filter and update a Delta table in PySpark and save the result?

I have a Delta table saved in S3, and I’m using an AWS Glue job to read a set of CSV files into a PySpark DataFrame and then append its rows to the Delta table. Before appending, I need to delete the rows in the Delta table whose date matches any date that appears in the DataFrame created from the CSV files.
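To make the intended flow concrete, here is a rough sketch of the delete-then-append step. It is only an illustration under some assumptions: the date column is called `date` and holds DATE values or ISO `YYYY-MM-DD` strings, the Glue job has the Delta Lake libraries available, and the helper names `build_date_predicate` and `replace_dates_and_append` are made up for this example.

```python
from datetime import date


def build_date_predicate(dates, column="date"):
    """Build a SQL predicate matching any of the given dates.

    Each value is parsed as an ISO date first, so only well-formed
    dates end up inside the predicate string.
    """
    iso_dates = sorted({date.fromisoformat(str(d)).isoformat() for d in dates})
    if not iso_dates:
        raise ValueError("no dates given")
    quoted = ", ".join(f"'{d}'" for d in iso_dates)
    return f"`{column}` IN ({quoted})"


def replace_dates_and_append(spark, new_df, table_path, date_col="date"):
    """Delete rows whose date appears in new_df, then append new_df.

    Assumes table_path points at an existing Delta table and that
    new_df has the same schema as that table.
    """
    # Imported here so the predicate helper can be used without Delta installed.
    from delta.tables import DeltaTable

    # Collect the distinct dates present in the incoming CSV data.
    rows = new_df.select(date_col).distinct().collect()
    dates = [row[0] for row in rows if row[0] is not None]

    # Remove the existing rows for those dates, if there are any.
    if dates:
        predicate = build_date_predicate(dates, date_col)
        DeltaTable.forPath(spark, table_path).delete(predicate)

    # Append the new rows to the Delta table.
    new_df.write.format("delta").mode("append").save(table_path)
```

In the Glue job this would be called with the existing Spark session, the DataFrame read from the CSV files, and the S3 path of the table. The delete and the append are two separate commits here. If that matters, writing with `mode("overwrite")` and the `replaceWhere` option set to the same predicate might be an alternative that does both in one commit.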