When I process the following data, which contains NA values, max() raises a warning.
For the data frame df below, I want to calculate the maximum value of Var_2 for each group in Var_1.
library(dplyr)  # provides %>%, group_by() and summarise()

df <- data.frame(Var_1 = c("Grp 1", "Grp 1", "Grp 1", "Grp 2", "Grp 2", "Grp 2", "Grp 3", "Grp 3", "Grp 3"),
                 Var_2 = c(1, 2, 3, NA, NA, NA, 7, NA, 9))

# Maximum of Var_2 per group, ignoring NA values
ck <- df %>%
  group_by(Var_1) %>%
  summarise(max_var = max(Var_2, na.rm = TRUE))
print(ck)
  Var_1 max_var
  <chr>   <dbl>
1 Grp 1       3
2 Grp 2    -Inf
3 Grp 3       9
If I don’t pass na.rm = TRUE to max(), the result for Grp 3 is NA. If I do pass na.rm = TRUE, all of the values in Grp 2 are dropped, so max() has nothing left to compare and returns -Inf with the warning “no non-missing arguments to max; returning -Inf”.
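To illustrate where the -Inf comes from, here is a minimal base R sketch of the Grp 2 case on its own. The variable names `x`, `x_without_na` and `res` are only for illustration, and the warning is suppressed here just to keep the output clean.

```r
# Mirrors Grp 2: every value in the group is NA
x <- c(NA_real_, NA_real_, NA_real_)

# This is what na.rm = TRUE leaves for max() to work with: nothing
x_without_na <- x[!is.na(x)]
length(x_without_na)   # 0

# max() of an empty set returns -Inf (and normally emits the warning)
res <- suppressWarnings(max(x, na.rm = TRUE))
res                    # -Inf
```

So the -Inf is what max() returns for an empty input, not something specific to summarise().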
The result I was hoping for is:
Var_1 | max_var |
---|---|
Grp 1 | 3 |
Grp 2 | NA |
Grp 3 | 9 |
Does anyone have any suggestions on how to avoid this warning and get the result I want? Thanks.
I’ve tried removing all the NA rows first and then calculating with summarize(max()), but this causes Grp 2 to disappear entirely. I want to keep this group and have its result be NA.
df <- data.frame(Var_1 = c("Grp 1", "Grp 1", "Grp 1", "Grp 2", "Grp 2", "Grp 2", "Grp 3", "Grp 3", "Grp 3"),
                 Var_2 = c(1, 2, 3, NA, NA, NA, 7, NA, 9))

# Drop NA rows before grouping; Grp 2 has no rows left afterwards
ck <- df %>%
  filter(!is.na(Var_2)) %>%
  group_by(Var_1) %>%
  summarise(max_var = max(Var_2, na.rm = TRUE))
print(ck)
  Var_1 max_var
  <chr>   <dbl>
1 Grp 1       3
2 Grp 3       9
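One direction that seems like it could give the result I am after is to check whether a group is entirely NA before calling max(). This is only a sketch under that assumption, not a confirmed fix. `safe_max` is a hypothetical helper name I made up; it is not part of dplyr or base R.

```r
library(dplyr)

df <- data.frame(
  Var_1 = rep(c("Grp 1", "Grp 2", "Grp 3"), each = 3),
  Var_2 = c(1, 2, 3, NA, NA, NA, 7, NA, 9)
)

# Hypothetical helper: return NA when every value is missing,
# otherwise the maximum of the non-missing values.
safe_max <- function(x) {
  if (all(is.na(x))) NA_real_ else max(x, na.rm = TRUE)
}

ck <- df %>%
  group_by(Var_1) %>%
  summarise(max_var = safe_max(Var_2))
print(ck)
```

Because max() is never called on an all-NA group, the warning should not be triggered, and Grp 2 stays in the output with max_var set to NA.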