I have a data frame with NA values in column X1 and a grouping variable group. I want to replace each NA with a value sampled from the non-NA values of the same group. This should be done for all groups except one (group == "C"). For this conditional replacement with resampled data I tried if/else and case_when within dplyr's mutate, but without success. I guess this is because the TRUE and FALSE branches are both evaluated before the condition is assessed. (The case_when condition itself works and selects the appropriate cases, as shown by the calculation of X2; the problem only appears once sample is used.)
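To illustrate the guess above: as far as I understand it, case_when evaluates every right-hand side in full and only afterwards picks values according to the conditions. A minimal sketch of that behaviour, where the stop() call is only a hypothetical probe and not part of my real code:

```r
library(dplyr)

# The first condition is FALSE everywhere, so its branch is never selected.
# If case_when only evaluated selected branches, stop() would never run.
res <- tryCatch(
  case_when(c(FALSE, FALSE) ~ stop("right-hand side was evaluated"),
            TRUE ~ 1),
  error = function(e) "error"  # reached only if the unselected branch was run
)

print(res)
```

If this reading is right, the sample call in my attempts would also run for group C, where there are no non-NA values to sample from.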
library(dplyr)

# Original data frame
df <-
  data.frame(
    id = 1:10,
    group = c(rep("A", 5), rep("B", 4), "C"),
    X1 = c(NA, 2, 1, NA, 4, 3, NA, 8, 9, NA)) %>%
  group_by(group) %>%
  mutate(X2 = case_when(is.na(X1) & group != "C" ~ 3,
                        TRUE ~ 2))
# Approach with if/else (doesn't work)
df %>%
  mutate(X3 = if (is.na(X1) & group != "C") sample(X1[!is.na(X1)], size = n(), replace = TRUE) else X1)
# Approach with case_when (doesn't work either)
df %>%
  mutate(X3 = case_when(is.na(X1) & group != "C" ~
                          sample(X1[!is.na(X1)], size = n(), replace = TRUE),
                        TRUE ~ X1))
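For reference, the result I am after could be produced along the following lines. This is only a sketch under some assumptions: fill_na_by_resampling is a hypothetical helper name, the seed is fixed purely for reproducibility, and the group check uses a scalar (group[1]) so that a plain if sees a single value per group.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed only to make the sketch reproducible

df <- data.frame(
  id = 1:10,
  group = c(rep("A", 5), rep("B", 4), "C"),
  X1 = c(NA, 2, 1, NA, 4, 3, NA, 8, 9, NA)
)

# Hypothetical helper: replace the NAs in x with values resampled
# from the non-NA values of x; x is returned unchanged if there is
# nothing to fill or nothing to sample from.
fill_na_by_resampling <- function(x) {
  pool <- x[!is.na(x)]
  missing <- is.na(x)
  if (length(pool) == 0 || !any(missing)) return(x)
  # Index into pool instead of calling sample(pool, ...), because
  # sample() treats a single number n as the range 1:n.
  x[missing] <- pool[sample.int(length(pool), sum(missing), replace = TRUE)]
  x
}

result <- df %>%
  group_by(group) %>%
  # group[1] is a single value within each group, so `if` gets a scalar
  mutate(X3 = if (group[1] != "C") fill_na_by_resampling(X1) else X1) %>%
  ungroup()

print(result)
```

Here only the NA positions are overwritten, and group C keeps its NA because the else branch returns X1 untouched.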