6-bit lookup using SIMD AVX2
I am trying to get a 6-bit table lookup working with AVX2. I split each 6-bit index into its low 4 bits and high 2 bits: the low 4 bits are used for the shuffle operation, and I then blend the shuffle results using the appropriate masks. The logic seems fine to me, so I need help understanding what I am doing wrong. The output is close to the scalar equivalent, but not identical.
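For reference, here is a minimal sketch of the split/shuffle/blend approach described above. It assumes a 64-entry byte table and one index per input byte. All names (`lookup6_scalar`, `lookup6_avx2`, `lookup6_mismatches`) are hypothetical and chosen for illustration. The `target("avx2")` attribute is GCC/Clang-specific, and the code assumes the CPU supports AVX2 at run time. The comments mark the places where this kind of lookup commonly goes subtly wrong: the per-lane behaviour of the byte shuffle, the missing 8-bit shift, and bit 7 of the shuffle index.

```c
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference: out[i] = table[idx[i] & 0x3F]. */
void lookup6_scalar(const uint8_t table[64], const uint8_t *idx,
                    uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = table[idx[i] & 0x3F];
}

/* Load 16 table bytes and copy them into both 128-bit lanes.
   _mm256_shuffle_epi8 only looks up within its own lane, so each
   lane needs a full copy of the 16-byte sub-table. */
__attribute__((target("avx2")))
static __m256i load_subtable(const uint8_t *p)
{
    return _mm256_broadcastsi128_si256(_mm_loadu_si128((const __m128i *)p));
}

/* AVX2 version: 32 indices per iteration, scalar code for the tail. */
__attribute__((target("avx2")))
void lookup6_avx2(const uint8_t table[64], const uint8_t *idx,
                  uint8_t *out, size_t n)
{
    const __m256i t0 = load_subtable(table + 0);
    const __m256i t1 = load_subtable(table + 16);
    const __m256i t2 = load_subtable(table + 32);
    const __m256i t3 = load_subtable(table + 48);
    const __m256i low4 = _mm256_set1_epi8(0x0F);
    const __m256i low2 = _mm256_set1_epi8(0x03);

    size_t i = 0;
    for (; i + 32 <= n; i += 32) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(idx + i));

        /* Low nibble selects within a sub-table. The mask also clears
           bit 7, which would otherwise force the shuffle output to 0. */
        __m256i lo = _mm256_and_si256(v, low4);

        /* AVX2 has no 8-bit shift. The 16-bit shift pulls bits in from
           the neighbouring byte, so mask down to the two wanted bits. */
        __m256i hi = _mm256_and_si256(_mm256_srli_epi16(v, 4), low2);

        /* Look up in all four sub-tables, then keep the one chosen by hi.
           cmpeq yields 0xFF or 0x00 per byte; blendv tests the top bit. */
        __m256i r = _mm256_shuffle_epi8(t0, lo);
        r = _mm256_blendv_epi8(r, _mm256_shuffle_epi8(t1, lo),
                               _mm256_cmpeq_epi8(hi, _mm256_set1_epi8(1)));
        r = _mm256_blendv_epi8(r, _mm256_shuffle_epi8(t2, lo),
                               _mm256_cmpeq_epi8(hi, _mm256_set1_epi8(2)));
        r = _mm256_blendv_epi8(r, _mm256_shuffle_epi8(t3, lo),
                               _mm256_cmpeq_epi8(hi, _mm256_set1_epi8(3)));

        _mm256_storeu_si256((__m256i *)(out + i), r);
    }
    lookup6_scalar(table, idx + i, out + i, n - i);
}

/* Hypothetical self-check: counts positions where the two versions differ. */
size_t lookup6_mismatches(void)
{
    uint8_t table[64], idx[100], a[100], b[100];
    size_t bad = 0;

    for (int i = 0; i < 64; i++)
        table[i] = (uint8_t)(i * 7 + 3);   /* 64 distinct values */
    for (int i = 0; i < 100; i++)
        idx[i] = (uint8_t)(i * 37);        /* includes bytes with bits 6-7 set */

    lookup6_scalar(table, idx, a, 100);
    lookup6_avx2(table, idx, b, 100);
    for (int i = 0; i < 100; i++)
        bad += (a[i] != b[i]);
    return bad;
}
```

Calling `lookup6_mismatches()` compares the two versions on 100 indices, so both the vector loop and the scalar tail are exercised. Comparing against a scalar reference this way should show which index values produce wrong output, and that usually points to the faulty step.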