Exponential loss for multiclass classification with sklearn’s GradientBoostingClassifier
I am dealing with a very imbalanced multiclass classification problem and trying to use sklearn’s GradientBoostingClassifier as my model. From the documentation here, I can see that the only available loss functions are 'log_loss' and 'exponential'. From my research, I understand that exponential loss grows exponentially for negative margins, which makes it more sensitive to outliers, whereas log loss grows only linearly for negative margins. This makes me think exponential loss would do better at classifying my minority classes. However, when I try to use the exponential loss, I get an error.
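A minimal sketch that reproduces the same failure mode on synthetic data (the class counts and feature setup below are made up, not my real dataset):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for an imbalanced 3-class problem
# (the proportions are illustrative, not my actual data).
X, y = make_classification(
    n_samples=2000,
    n_classes=3,
    n_informative=6,
    weights=[0.85, 0.10, 0.05],
    random_state=0,
)

clf = GradientBoostingClassifier(loss="exponential")
# fit() raises here: sklearn's exponential loss only supports
# binary classification, so a 3-class target is rejected.
clf.fit(X, y)
```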
XGBoost scale_pos_weight: if the training dataset has more positive samples than negative samples, does it still balance correctly?
After researching, I realized that scale_pos_weight is typically calculated as the ratio of the number of negative samples to the number of positive samples in the training data. My dataset has 840 negative samples and 2650 positive samples, so the ratio is 840 / 2650 ≈ 0.32. If my samples were the other way around, I am fairly confident scale_pos_weight would be the right approach. Is it safe to assume that, since the ratio is less than 1, it will still balance correctly? Specificity does matter in my study, but our goal is more about recall, precision, and F1 score. Could this contribute to more false positives by impacting specificity the most?
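For context, this is roughly how I am computing and passing the ratio (using the xgboost sklearn wrapper; the features here are placeholders, only the label counts match my data):

```python
import numpy as np
from xgboost import XGBClassifier

# Labels with the counts from my training data: 840 negatives, 2650 positives
y_train = np.array([0] * 840 + [1] * 2650)
X_train = np.random.RandomState(0).randn(len(y_train), 10)  # placeholder features

# scale_pos_weight as the ratio of negatives to positives
n_neg = (y_train == 0).sum()
n_pos = (y_train == 1).sum()
spw = n_neg / n_pos  # 840 / 2650 ≈ 0.32, i.e. positives get down-weighted

model = XGBClassifier(scale_pos_weight=spw, eval_metric="logloss")
model.fit(X_train, y_train)
```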