My C++ program loads a CatBoost model from a .cbm file and makes predictions. However, inference takes too long for my use case, and I want to reduce its latency.
What can I do on the C++ side to optimize prediction latency for this kind of model?
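For context, here is a minimal sketch of the kind of loading/prediction code in question. It assumes the CatBoost C evaluation API (`c_api.h` from libcatboostmodel) with single-row prediction; the model path, feature values, and feature counts are placeholders:

```cpp
#include <cstdio>
#include <vector>

#include "c_api.h"  // CatBoost C evaluation API (libcatboostmodel); include path may differ

int main() {
    // Create a model handle and load the trained model from the .cbm file.
    ModelCalcerHandle* model = ModelCalcerCreate();
    if (!LoadFullModelFromFile(model, "model.cbm")) {  // placeholder path
        std::fprintf(stderr, "Failed to load model: %s\n", GetErrorString());
        ModelCalcerDelete(model);
        return 1;
    }

    // Single-row prediction: 4 numeric features, no categorical features
    // (feature counts and values here are placeholders for illustration).
    const std::vector<float> floatFeatures = {0.5f, 1.2f, -3.4f, 7.0f};
    double prediction = 0.0;
    if (!CalcModelPredictionSingle(model,
                                   floatFeatures.data(), floatFeatures.size(),
                                   /*catFeatures=*/nullptr, /*catFeaturesSize=*/0,
                                   &prediction, /*resultSize=*/1)) {
        std::fprintf(stderr, "Prediction failed: %s\n", GetErrorString());
        ModelCalcerDelete(model);
        return 1;
    }

    std::printf("raw prediction: %f\n", prediction);
    ModelCalcerDelete(model);
    return 0;
}
```

Calls like this are made repeatedly, and the per-call latency is what I am trying to reduce.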