I am using XGBClassifier for a multiclass classification problem and GridSearchCV to tune the hyperparameters. I thought the F1 score would be the best evaluation metric for the problem. Here's a code snippet where I am trying to optimize max_depth and min_child_weight:
from xgboost import XGBClassifier
from sklearn.model_selection import GridSearchCV

# Updating our default model with the optimal number of estimators
xgb1 = XGBClassifier(objective='multi:softprob',
                     num_class=len(df['Diagnosis'].unique()),
                     learning_rate=0.1, n_estimators=10, eval_metric='auc')

# Grid of candidate values for max_depth and min_child_weight
param_test1 = {'max_depth': range(3, 40), 'min_child_weight': range(1, 3)}

# Grid search with 5-fold cross-validation over the parameter grid
gsearch1 = GridSearchCV(estimator=xgb1, param_grid=param_test1,
                        scoring='f1', cv=5)
gsearch1.fit(trn_xs[features], trn_y.cat.codes)
gsearch1.cv_results_['params'], gsearch1.best_params_, gsearch1.best_score_
However, I keep getting the following error:
UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/sklearn/model_selection/_validation.py", line 767, in _score
scores = scorer(estimator, X_test, y_test)
File "/opt/conda/lib/python3.10/site-packages/sklearn/metrics/_scorer.py", line 234, in __call__
return self._score(
File "/opt/conda/lib/python3.10/site-packages/sklearn/metrics/_scorer.py", line 282, in _score
return self._sign * self._score_func(y_true, y_pred, **self._kwargs)
File "/opt/conda/lib/python3.10/site-packages/sklearn/metrics/_classification.py", line 1146, in f1_score
return fbeta_score(
File "/opt/conda/lib/python3.10/site-packages/sklearn/metrics/_classification.py", line 1287, in fbeta_score
_, _, f, _ = precision_recall_fscore_support(
File "/opt/conda/lib/python3.10/site-packages/sklearn/metrics/_classification.py", line 1573, in precision_recall_fscore_support
labels = _check_set_wise_labels(y_true, y_pred, average, labels, pos_label)
File "/opt/conda/lib/python3.10/site-packages/sklearn/metrics/_classification.py", line 1391, in _check_set_wise_labels
raise ValueError(
ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
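For what it's worth, the same warning shows up with a fully self-contained version on synthetic data, so it doesn't seem to be something odd about my dataset (make_classification here is just a stand-in for my real features and the Diagnosis target):

# Minimal reproduction with synthetic data standing in for my dataset
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Three classes to mimic the multiclass Diagnosis target
X, y = make_classification(n_samples=200, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

xgb_repro = XGBClassifier(objective='multi:softprob', learning_rate=0.1,
                          n_estimators=10)
gsearch_repro = GridSearchCV(estimator=xgb_repro,
                             param_grid={'max_depth': range(3, 5)},
                             scoring='f1', cv=5)
gsearch_repro.fit(X, y)  # each fold emits the same UserWarning; scores become nan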
The error message tells me to choose another average setting, but I can't see how to pass average=None (or any of the others) through GridSearchCV's scoring parameter, and as I understand it average=None would return one F1 score per class rather than the single number GridSearchCV needs to rank candidates. What do I do?
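For reference, these are the workarounds I've come across, though I'm not sure which averaging setting (if any) is appropriate here; 'f1_macro' and the make_scorer call below are standard sklearn options, not something from my current code:

from sklearn.metrics import f1_score, make_scorer

# Option 1: one of sklearn's predefined multiclass F1 scoring strings
gsearch1 = GridSearchCV(estimator=xgb1, param_grid=param_test1,
                        scoring='f1_macro', cv=5)

# Option 2: build the scorer explicitly so average= can be set
gsearch1 = GridSearchCV(estimator=xgb1, param_grid=param_test1,
                        scoring=make_scorer(f1_score, average='weighted'), cv=5)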