I am implementing an encryption/decryption transformation that takes a Parquet file as input, encrypts or decrypts every column value in every row using AES-CTR, and writes the result to a destination location. (I am aware of Parquet Modular Encryption, but that is not an option right now for various reasons.)
I am using Arrow to read, process, and write the encrypted/decrypted Parquet files, and I am doing this in C++ for performance reasons.
I can read the Parquet file record batch by record batch, and for each record batch I am trying to apply encryption/decryption to each column array using my own compute function. I am following the pattern shown in the getting started guide for Arrow.
To proceed, I need to add my encryption/decryption function. I am doing something like the following:
arrow::compute::ScalarFunction encryptfunc = arrow::compute::ScalarFunction("encr", arrow::compute::Arity::Unary(), arrow::compute::FunctionDoc::Empty());
const arrow::Status &encrstatus = encryptfunc.AddKernel({arrow::float64()}, arrow::binary(), arrow::compute::ArrayKernelExec(???));
I am unsure where to plug in my lambda function that encapsulates the key and the encryption logic (the ??? above). Is this even possible?
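To make the intent concrete, here is a library-free sketch of the kind of callable I have in mind. It does not use Arrow at all: `TransformColumn` is a hypothetical stand-in for whatever applies the kernel to a column, and the XOR "cipher" is only a placeholder for the AES-CTR call, not real encryption. All names (`Key`, `ValueTransform`, `MakeToyEncryptor`, `ToyDecrypt`, `TransformColumn`) are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>
#include <vector>

// Hypothetical key type; a real AES-CTR setup would also carry a nonce/counter.
using Key = std::array<std::uint8_t, 16>;

// Per-value transform: one float64 in, opaque bytes out.
using ValueTransform = std::function<std::string(double)>;

// Returns a lambda that captures the key by value.
// Placeholder for AES-CTR: XORs the raw bytes of the value with the key.
ValueTransform MakeToyEncryptor(const Key& key) {
  return [key](double value) {
    std::string out(sizeof(double), '\0');
    std::memcpy(&out[0], &value, sizeof(double));
    for (std::size_t i = 0; i < out.size(); ++i) {
      out[i] = static_cast<char>(static_cast<std::uint8_t>(out[i]) ^
                                 key[i % key.size()]);
    }
    return out;
  };
}

// Inverse of the toy transform (XOR with the same key undoes it).
double ToyDecrypt(const Key& key, const std::string& bytes) {
  assert(bytes.size() == sizeof(double));
  std::string tmp = bytes;
  for (std::size_t i = 0; i < tmp.size(); ++i) {
    tmp[i] = static_cast<char>(static_cast<std::uint8_t>(tmp[i]) ^
                               key[i % key.size()]);
  }
  double value = 0.0;
  std::memcpy(&value, tmp.data(), sizeof(double));
  return value;
}

// Stand-in for applying a kernel to one float64 column of a record batch.
std::vector<std::string> TransformColumn(const std::vector<double>& column,
                                         const ValueTransform& fn) {
  std::vector<std::string> out;
  out.reserve(column.size());
  for (double v : column) {
    out.push_back(fn(v));
  }
  return out;
}
```

What I am looking for is the Arrow equivalent of passing `MakeToyEncryptor(key)` into `TransformColumn`: the state (the key) travels with the callable, and the output column is binary.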