I want to take a dataset that contains a truth column and several predictors, and summarise the AUC, "best" threshold, sensitivity, and specificity for each predictor, split by a grouping variable.
Here is an example that works for one predictor variable ("mpg"):
    library(dplyr)

    # AUC per group
    mtcars %>%
      group_by(am) %>%
      group_modify(~ data.frame(auc = as.numeric(pROC::auc(pROC::roc(.x, response = "vs", predictor = "mpg"))))) %>%
      # "best" threshold with its specificity and sensitivity, per group
      left_join(
        mtcars %>%
          group_by(am) %>%
          group_modify(~ pROC::coords(pROC::roc(.x, response = "vs", predictor = "mpg"), "best")),
        by = "am")
Producing this:
    # A tibble: 2 × 5
    # Groups:   am [2]
         am   auc threshold specificity sensitivity
      <dbl> <dbl>     <dbl>       <dbl>       <dbl>
    1     0 0.946      17.6       0.833           1
    2     1 0.952      21.2       0.833           1
This is a little ugly and inefficient, since it builds the same ROC curve twice per group. Is there a less clunky way to get the same output and also iterate over multiple predictors, probably using map()?
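For reference, here is a minimal sketch of the shape I have in mind, not a confirmed solution. It assumes a hypothetical helper, roc_summary(), that fits the ROC curve once and returns the AUC together with the "best" coordinates, plus an arbitrary second predictor ("wt") chosen only for illustration. The columns are passed to pROC::roc() as vectors rather than by name, because the column names arrive in variables inside the helper. It also assumes a pROC version where coords() returns a data frame, as in the output above.

```r
library(dplyr)
library(purrr)

# Hypothetical helper (not part of pROC): fit the ROC curve once,
# then reuse it for both the AUC and the "best" coordinates.
roc_summary <- function(data, response, predictor) {
  r <- pROC::roc(response = data[[response]],
                 predictor = data[[predictor]],
                 quiet = TRUE)
  best <- pROC::coords(r, "best")  # threshold, specificity, sensitivity
  data.frame(predictor = predictor,
             auc = as.numeric(pROC::auc(r)),
             best)
}

# Assumed predictor set; "wt" is only here to show the iteration.
predictors <- c("mpg", "wt")

result <- mtcars %>%
  group_by(am) %>%
  group_modify(~ bind_rows(map(predictors, function(p) roc_summary(.x, "vs", p)))) %>%
  ungroup()

result
```

This would give one row per group and predictor (more if coords() finds tied "best" thresholds), with the predictor name in its own column.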