I have the following problem:
There seems to be a problem with as.mids() from the mice package in R. The values in the long-format data differ from the values I get back after applying as.mids(). In the first script below, the scaled variable differs after the conversion; in the second, the median split is no longer correct after the conversion.
Here are my scripts:
> imp_selected_mids <- complete(imp_selected_mids, "long", include = TRUE)
> imp_selected_mids <- imp_selected_mids %>%
+ group_by(.imp) %>%
+ mutate(dar_scale = scale(DAR_TOTAL))%>%
+ ungroup()
> imp_selected_mids$dar_scale <- as.numeric(imp_selected_mids$dar_scale)
> imp_selected_test15 <- subset(imp_selected_mids, .imp==15)
> imp_selected_test9 <- subset(imp_selected_mids, .imp==9)
> imp_selected_mids <- as.mids(imp_selected_mids)
> #test
> select_dar15 <- imp_selected_test15$dar_scale
> test15_select_imp <- complete(imp_selected_mids, 15)
> select_dar15_mids <- test15_select_imp$dar_scale
> print(all.equal(select_dar15, select_dar15_mids))
[1] "Mean relative difference: 0.01216397"
> # Median split for substance use
> test <- imp_selected_mids
> test <- complete(test, "long", include = TRUE)
> test <- test%>%
+ group_by(.imp) %>%
+ mutate(
+ median_sub_all = median(sum_sub_all, na.rm = TRUE)) %>%
+ ungroup()
> test$median_sub_all <- as.numeric(test$median_sub_all)
> test$sub_binary <- ifelse(test$sum_sub_all < test$median_sub_all, 1, 2)
> str(test$sub_binary)
num [1:1092] 1 NA NA 1 NA NA 1 1 1 2 ...
> test$sub_binary <- as.integer(test$sub_binary)
> table(test$sub_binary)
  1   2
530 535
> test1 <- subset(test, .imp==1)
> test15 <- subset(test, .imp==15)
> test <- select(test, -median_sub_all)
> test <- as.mids(test)
> # Store each dataset separately in the environment to access it
> for (i in 1:20) {
+ test_data <- mice::complete(test, action = i)
+ assign(paste0("test_data_", i), test_data)
+ }
> table(test_data_1$sub_binary)
 1  2
23 29
> table(test1$sub_binary)
 1  2
26 26
Thanks in advance.
I tried working with the code and checked whether the difference also occurs in other ways.
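Here is a minimal sketch of the kind of check I mean. It uses the nhanes data that ships with mice as a stand-in for my own data, and the names bmi_scale, long, imp2, and k are made up for this example. For one imputation, it compares the scaled values in the long data with the values returned by complete() after as.mids(), split by whether the value was observed in the original data (.imp == 0):

```r
library(mice)

# Assumption: nhanes (bundled with mice) stands in for the real data.
imp  <- mice(nhanes, m = 5, seed = 1, printFlag = FALSE)
long <- complete(imp, "long", include = TRUE)

# Scale bmi within each imputation, including the original data (.imp == 0).
# ave() is the base-R equivalent of group_by(.imp) %>% mutate(...).
long$bmi_scale <- ave(long$bmi, long$.imp,
                      FUN = function(x) as.numeric(scale(x)))

k <- 3  # imputation to inspect (arbitrary choice)

# Scaled values for imputation k as stored in the long data
before <- long$bmi_scale[long$.imp == k]

# Round trip: long data -> mids -> completed data set k
imp2  <- as.mids(long)
after <- complete(imp2, k)$bmi_scale

# Which rows had an observed bmi in the original data?
was_observed <- !is.na(long$bmi[long$.imp == 0])

# Which rows changed during the round trip?
differs <- abs(before - after) > 1e-8
print(table(was_observed, differs))
```

If the mismatches turn up only in rows that were observed in the original data, that would suggest as.mids() takes observed cells from the .imp == 0 rows and only the originally missing cells from each imputation. In that case a transformation computed per imputation would survive the round trip only where the .imp == 0 value is NA.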