Spark: read Parquet files based on multiple partition columns, i.e., DATE_KEY and BASE_FEED
I’m using PySpark to read Parquet files from an HDFS location partitioned by DATE_KEY. The following code always reads the files from the MAX(DATE_KEY) partition and converts them to a Polars DataFrame.