I am evaluating a binary prediction model (outcome 0 or 1) with tidymodels in R. Before fitting, I created importance weights for every individual in my dataset. The data splitting and the full workflow are done, but I would also like my performance metrics (roc_auc and brier_class) to be weighted by those importance weights.
Right now I have:
library(tidymodels)

set.seed(234)
data_folds <- group_vfold_cv(data, group = Hospital)

# Specify the logistic regression model
model <- logistic_reg(mode = "classification", engine = "glm")

# Build the workflow, with imp_weights as case weights
current_wf <- workflow() |>
  add_case_weights(imp_weights) |>
  add_formula(Status30D ~ Max_NEWS) |>
  add_model(model)
current_wf

# Set up parallel processing
doParallel::registerDoParallel(cores = 6)
cntrl <- control_resamples(save_pred = TRUE)

# Internal-external validation of the current EWS (also checking demographic parity)
current_fit <- fit_resamples(
  current_wf,
  resamples = data_folds,
  metrics = metric_set(roc_auc, brier_class),
  control = cntrl
)
I know I can do something like this:
current_fit |>
  collect_predictions() |>
  arrange(.row) |>
  mutate(weights = data$imp_weights) |>
  roc_auc(Status30D, .pred_Deceased, case_weights = weights)
But that pools all predictions into a single estimate, so it will not give me standard errors across the resamples.
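The closest workaround I can sketch is to compute the weighted metrics per resample and summarise them by hand. The example below is only an illustration under stated assumptions: `preds` is simulated stand-in data (not my real predictions), shaped like the output of collect_predictions() with a `weights` column attached, and the standard error is taken as sd / sqrt(number of folds).

```r
library(dplyr)
library(yardstick)

# ASSUMPTION: simulated stand-in for collect_predictions() output.
# `id` marks the resample; `weights` plays the role of imp_weights.
set.seed(234)
n <- 600
preds <- tibble::tibble(
  id = rep(paste0("Resample", 1:6), each = 100),
  Status30D = factor(
    sample(c("Deceased", "Alive"), n, replace = TRUE),
    levels = c("Deceased", "Alive")
  ),
  .pred_Deceased = runif(n),
  weights = runif(n, min = 0.5, max = 2)
)

# Weighted metrics computed separately within each resample
by_fold <- preds |> group_by(id)
per_fold <- bind_rows(
  roc_auc(by_fold, Status30D, .pred_Deceased, case_weights = weights),
  brier_class(by_fold, Status30D, .pred_Deceased, case_weights = weights)
)

# Mean across folds and a standard error of that mean, per metric
summary_tbl <- per_fold |>
  group_by(.metric) |>
  summarise(
    mean = mean(.estimate),
    n_folds = n(),
    std_err = sd(.estimate) / sqrt(n_folds)
  )

summary_tbl
```

With my real results, `preds` would be replaced by the collect_predictions() output after attaching the weights as in my snippet above. This works, but it duplicates by hand what I would expect fit_resamples() and collect_metrics() to handle.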
Is there a way to request weighted performance metrics directly in fit_resamples(), or somewhere else in the workflow?