In Synapse Serverless, I ran OPTIMIZE on a Delta table to compact the files and, hopefully, improve performance.
To start, the table had a million parquet files, many of them tiny. My understanding is that this is suboptimal for performance.
So I ran an OPTIMIZE command on that table.
Great, it completed. But when I went back to check:
- Read performance is unchanged
- The files on the data lake appear unchanged as well (the screenshot is identical: file names, sizes, etc. Granted, that's only a subset of the files, but it's still identical, and what are those 7 KB files still doing there?)
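
One way to tell whether the folder listing is misleading would be to compare it with the files the Delta transaction log currently marks as active. The following is only a minimal Python sketch of that comparison, under several assumptions: the table folder is reachable as a local path (for example a mounted or downloaded copy), every commit is still present as a JSON file under `_delta_log` (checkpoint files are ignored), and paths in the log are plain relative paths with no URL-encoded characters. The function names `active_files` and `stale_files` are hypothetical, chosen for illustration.

```python
import json
from pathlib import Path


def active_files(table_dir):
    """Replay the JSON commits in _delta_log and return {path: size} of live files."""
    log_dir = Path(table_dir) / "_delta_log"
    live = {}
    # Commit files are named with zero-padded version numbers,
    # so sorting by name replays them in commit order.
    for commit in sorted(log_dir.glob("*.json")):
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                # An "add" action registers a data file in the table.
                live[action["add"]["path"]] = action["add"].get("size")
            elif "remove" in action:
                # A "remove" action drops the file from the table,
                # but does not delete it from storage.
                live.pop(action["remove"]["path"], None)
    return live


def stale_files(table_dir):
    """Parquet files present in the folder but not referenced as live by the log."""
    root = Path(table_dir)
    on_disk = {
        p.relative_to(root).as_posix()
        for p in root.rglob("*.parquet")
        if "_delta_log" not in p.parts  # skip checkpoint parquet files
    }
    return sorted(on_disk - set(active_files(table_dir)))
```

If `stale_files` returns the small files from the screenshot, they are still present in storage but are no longer part of the current table version according to the log. If they show up in `active_files` instead, the log still references them.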
What am I missing and/or misunderstanding?