I have some questions about CatBoost feature importances.
-
The first is about PredictionValuesChange. In a binary classification problem, the importance values do not look like probabilities at all (some are greater than 1). Are the predictions scaled in some way?
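To make this concrete, here is a minimal sketch of how I read the documented formula. As far as I understand the documentation, each split contributes (v1 - avr)^2 * c1 + (v2 - avr)^2 * c2, where v1, v2 are raw leaf values (not probabilities) and c1, c2 are leaf weights, and the results are then normalized so that all importances sum to 100. The toy splits, feature names, and helper functions below are hypothetical and are not CatBoost API.

```python
def split_importance(v1, c1, v2, c2):
    """Contribution of one split, following my reading of the documented formula."""
    # Weighted average of the two leaf values.
    avr = (v1 * c1 + v2 * c2) / (c1 + c2)
    return (v1 - avr) ** 2 * c1 + (v2 - avr) ** 2 * c2


def normalize_to_100(raw):
    """Rescale non-negative importances so that they sum to 100."""
    total = sum(raw.values())
    return {name: 100.0 * value / total for name, value in raw.items()}


# Hypothetical toy model: one split per feature.
# Each tuple is (v1, c1, v2, c2); leaf values are raw scores (log-odds), assumed here.
splits = {
    "age": [(-1.5, 40.0, 2.0, 60.0)],
    "income": [(-0.2, 50.0, 0.2, 50.0)],
}

# Sum the contributions of all splits on each feature, then normalize.
raw = {
    name: sum(split_importance(*s) for s in feature_splits)
    for name, feature_splits in splits.items()
}
importances = normalize_to_100(raw)
print(importances)
```

If this reading is right, values greater than 1 are expected, because the numbers are shares of 100 computed from raw leaf values rather than changes in predicted probability.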
-
Is the documented formula for InternalFeatureImportance correct? It is identical to the formula for PredictionValuesChange.
-
My last question is about PredictionDiff. The documentation says that, for each feature, PredictionDiff reflects the maximum possible change in the difference between the predictions if the value of the feature is changed for both objects.
How exactly is this computed? Are two objects x_i and x_j chosen to be as close as possible to each other and the feature of interest then changed, or is the prediction for x_i compared with the prediction for x_i', a copy of x_i with the feature of interest transformed?
I searched for information in the paper and in the Telegram chat, but I couldn't find anything.