I have a relatively long list of low bit integers by size 16*10^6.
The sparsity level of this list is about 40%, so applying any sparse matrix does not work.
The first step I tried Run-length encoding, which does not work here, there are not so many repeating values.
Moreover, I need an algorithm with fast decoding, that is quite crucial, encoding can be as long as it required.
Huffman and Zstandard on average lead to x3 compression, which is good. However, Huffman with such length does not meet time decoding criteria.
While Zstandard is good, I was thinking if it is possible to do something better based on the fact that there are only 16 unique values (considering 4 bits). Also, I have no idea what Zstandard does inside and how it works.
It is not necessary to have a very fast implementation, just a proof of concept would suffice.
For that, I looked into dictionary coding but most of the algorithms assume int32 or FP32 data formats, so the overhead for the code-book and indexing beats the purpose.
Also, while there some solutions for sorted integers list in my case the list is not sorted.
I also looked into how the BMP format works, but there is no clear evidence it helps to compress the data. So, any pointers would be appreciated.
The main requirements are that the data is low bit and fast decoding is important.