High Disk Usage when using RocksDBStateStoreProvider in Spark Structured Streaming
I have a stateful Spark Structured Streaming (version 3.3.1) application that processes input events in the pattern shown in the image.
I use RocksDBStateStoreProvider to maintain state in memory and on disk. There are about 400M rows in the state, and the total size is about 6 GB. Here is the configuration for RocksDB: