I have an ensemble TabularPredictor composed of only LightGBM models. During training, multithreading/multiprocessing worked fine, but at inference time the predictor seems to use only one CPU and therefore takes a very long time.
My inference data is large (~141 million samples), but prediction is taking much longer than I would expect, on the order of tens of hours. Looking at my resource usage, I can see that only one CPU is active during inference.
Aside from manually splitting the inference data into batches and parallelizing the prediction myself, is there a way to instruct the ensemble model to use more cores?
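For reference, the batching workaround I would like to avoid looks roughly like this (a minimal sketch; the predictor path, chunk count, and worker count are placeholders I made up):

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np
import pandas as pd
from autogluon.tabular import TabularPredictor

PREDICTOR_PATH = "AutogluonModels/ag-20240101_000000"  # placeholder path
N_WORKERS = 8  # placeholder worker count

def _predict_chunk(chunk: pd.DataFrame) -> pd.Series:
    # Each worker process loads its own copy of the predictor and scores one chunk.
    predictor = TabularPredictor.load(PREDICTOR_PATH)
    return predictor.predict(chunk)

def predict_in_parallel(data: pd.DataFrame) -> pd.Series:
    # Split the inference data into N_WORKERS chunks and score them in parallel.
    chunks = np.array_split(data, N_WORKERS)
    with ProcessPoolExecutor(max_workers=N_WORKERS) as pool:
        parts = list(pool.map(_predict_chunk, chunks))
    return pd.concat(parts)

# usage (inside an `if __name__ == "__main__":` guard):
# predictions = predict_in_parallel(test_df)
```

That would work, but it loads a separate copy of the predictor in every worker process, and it feels like something the ensemble should be able to do on its own.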
I assume I could do this by accessing the actual LightGBM models directly, but since the predictor is an ensemble, I don’t know how to reach the underlying trained models to manually update parameters such as num_threads.
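For context, this is roughly what I had in mind; I am guessing at the internals here (the _trainer attribute, the .model attribute, and the thread count are all assumptions on my part):

```python
from autogluon.tabular import TabularPredictor

predictor = TabularPredictor.load("AutogluonModels/ag-20240101_000000")  # placeholder path

# Guessing at AutoGluon internals: walk the trained models in the ensemble and
# raise the thread count on each underlying LightGBM estimator.
for model_name in predictor.model_names():  # get_model_names() on older AutoGluon versions
    ag_model = predictor._trainer.load_model(model_name)  # assumed internal API
    inner = getattr(ag_model, "model", None)              # assumed to hold the raw LightGBM object
    if inner is not None and hasattr(inner, "set_params"):
        inner.set_params(n_jobs=16)                       # placeholder core count
```

If there is a supported way to do this, or a predict-time option that controls the thread count, that would be preferable to poking at internals like this.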