Parquet partition performance with a WHERE clause
I’m trying to optimize the performance of a PySpark SQL query over Parquet files in Azure Synapse Analytics. My dataset has billions of records, so every bit of performance helps. My basic question: does Parquet's columnar storage really help with a WHERE clause on Year, or must I put /Year=2023 in the path with the OPENROWSET method to get a real performance boost?
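To make the comparison concrete, here is a rough sketch of the two read patterns I mean, written for the PySpark side. The storage path, the assumption that the data sits in Hive-style `Year=<value>` folders, and the function names are all placeholders for illustration, not my real setup.

```python
# Assumed (hypothetical) layout: <base>/Year=2023/part-*.parquet
BASE_PATH = "abfss://container@account.dfs.core.windows.net/dataset"  # placeholder path


def partition_folder(base_path: str, year: int) -> str:
    """Build the explicit partition folder path, e.g. <base>/Year=2023."""
    return f"{base_path.rstrip('/')}/Year={year}"


def read_with_filter(spark, base_path: str, year: int):
    """Variant A: read the dataset root and filter on Year with a WHERE-style predicate."""
    # Imported lazily so the path helper above can be used without Spark installed.
    from pyspark.sql import functions as F

    return spark.read.parquet(base_path).where(F.col("Year") == year)


def read_partition_folder(spark, base_path: str, year: int):
    """Variant B: point the reader directly at the Year=<year> folder."""
    return (
        spark.read
        .option("basePath", base_path)  # keeps Year as a column in the result
        .parquet(partition_folder(base_path, year))
    )
```

As far as I understand, calling `.explain()` on the DataFrame from each variant should show whether the `Year` condition ends up under `PartitionFilters` or only under `PushedFilters` in the physical plan, which is the difference I am trying to pin down.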