I have fitted the same model on 10 different subsets of a dataset (the 10 imputed datasets produced by mice), which gives 10 model fits, as follows:
library(mice)
library(ggeffects)
data("nhanes")
head(nhanes)
imp <- mice(nhanes, print = FALSE, m = 10, seed = 24415)
df <- complete(imp, "long")
model_fit <- lapply(1:10, function(i) {
  model = lm(bmi ~ age + hyp + chl,
             data = subset(df, `.imp` == i))
})
From this I get 10 different ggpredict objects:
ggpredict(model_fit[[1]], c("age", "hyp"))
ggpredict(model_fit[[2]], c("age", "hyp"))
ggpredict(model_fit[[3]], c("age", "hyp"))
ggpredict(model_fit[[4]], c("age", "hyp"))
ggpredict(model_fit[[5]], c("age", "hyp"))
ggpredict(model_fit[[6]], c("age", "hyp"))
ggpredict(model_fit[[7]], c("age", "hyp"))
ggpredict(model_fit[[8]], c("age", "hyp"))
ggpredict(model_fit[[9]], c("age", "hyp"))
ggpredict(model_fit[[10]], c("age", "hyp"))
I am looking for an efficient way to a) estimate the average of all the ggpredict objects and b) plot that average using ggplot.
So far I have tried storing the result of each ggpredict call in a list and computing
`Reduce(`+`, list_ggpred)/length(list_ggpred)`
This gave the warning:
"In Ops.factor(left, right) : '+' not meaningful for factors"
Any suggestions would be highly appreciated. Thanks.
I may have misunderstood, but one potential option could be:
library(mice)
library(ggeffects)
data("nhanes")
head(nhanes)
#>   age  bmi hyp chl
#> 1   1   NA  NA  NA
#> 2   2 22.7   1 187
#> 3   1   NA   1 187
#> 4   3   NA  NA  NA
#> 5   1 20.4   1 113
#> 6   3   NA  NA 184
imp <- mice(nhanes, print = FALSE, m = 10, seed = 24415)
df <- complete(imp, "long")
model_fit <- lapply(1:10, function(i) {
  model = lm(bmi ~ age + hyp + chl,
             data = subset(df, `.imp` == i))
})
library(tidyverse)
list_of_results <- map(model_fit, ggpredict, c("age", "hyp"))
ggpredicts <- map(list_of_results, `[[`, "predicted")
map(ggpredicts, mean)
#> [[1]]
#> [1] 25.93424
#>
#> [[2]]
#> [1] 26.01019
#>
#> [[3]]
#> [1] 26.18797
#>
#> [[4]]
#> [1] 26.69359
#>
#> [[5]]
#> [1] 25.90896
#>
#> [[6]]
#> [1] 26.26845
#>
#> [[7]]
#> [1] 26.10574
#>
#> [[8]]
#> [1] 25.81957
#>
#> [[9]]
#> [1] 26.34521
#>
#> [[10]]
#> [1] 26.89521
df <- bind_cols(map(ggpredicts, mean))
colnames(df) <- paste0("Model_", str_pad(1:10, 2, pad = "0"))
df %>%
  pivot_longer(everything(),
               values_to = "mean prediction",
               names_to = "model") %>%
  ggplot(aes(x = `model`, y = `mean prediction`)) +
  geom_col() +
  theme_bw()
Created on 2024-04-24 with reprex v2.1.0
Is that close to your expected outcome?
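If you instead want one averaged prediction for each age/hyp combination (rather than one mean per model), a minimal sketch could look like the code below. It assumes the same models as above. The names `long_df`, `all_preds` and `avg_pred` are only illustrative, and fixing the prediction grid with the bracket syntax in `terms` is my own choice so that every model is predicted at the same points. It averages the point predictions only.

```r
library(mice)
library(ggeffects)
library(dplyr)
library(purrr)
library(ggplot2)

data("nhanes")
imp <- mice(nhanes, printFlag = FALSE, m = 10, seed = 24415)
long_df <- complete(imp, "long")  # illustrative name for the stacked imputations

# Same model fitted to each imputed dataset
model_fit <- lapply(1:10, function(i) {
  lm(bmi ~ age + hyp + chl, data = subset(long_df, .imp == i))
})

# Assumption: predict on an explicit, shared grid so the rows line up across models
grid_terms <- c("age [1, 2, 3]", "hyp [1, 2]")

# One ggpredict object per model, converted to plain data frames and stacked
all_preds <- map(model_fit, ggpredict, terms = grid_terms) %>%
  map(as.data.frame) %>%
  bind_rows(.id = "imp")

# Average the predicted values within each age (x) / hyp (group) cell
avg_pred <- all_preds %>%
  group_by(x, group) %>%
  summarise(predicted = mean(predicted), .groups = "drop")

ggplot(avg_pred, aes(x = x, y = predicted, colour = group)) +
  geom_line() +
  labs(x = "age", y = "Average predicted bmi", colour = "hyp") +
  theme_bw()
```

Grouping by `x` and `group` sidesteps the `Ops.factor` warning from the question, because the factor column is used as a key instead of being added. Averaging `conf.low` and `conf.high` in the same way would ignore the between-imputation variance; as far as I know, ggeffects also provides `pool_predictions()` for a list of ggpredict objects from imputed data, which may be worth checking if you need pooled intervals.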