I want to use the collapse package to summarise the following data:
library(collapse)
df <- data.frame(id = c(1, 1, 1,
                        2, 2, 2),
                 var = c(NA, "a", "b",
                         "e", "f", "g"),
                 var1 = c("q", "w", "r",
                          "l", "k", NA),
                 other = c(4, 5, 6, 7, 8, 9))
df
#   id  var var1 other
# 1  1 <NA>    q     4
# 2  1    a    w     5
# 3  1    b    r     6
# 4  2    e    l     7
# 5  2    f    k     8
# 6  2    g <NA>     9
into the following, where I simply get the first row of var, the last row of var1 (excluding NAs) and the mean of other, by id:
#   id  var var1 other
# 1  1 <NA>    r     5
# 2  2    e    k     8
The default behaviour of na.rm for ffirst and flast is to ignore NAs:
logical. TRUE skips missing values and returns the first / last
non-missing value i.e. if the first (1) / last (n) value is NA, take
the second (2) / second-to-last (n-1) value etc..
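To illustrate the documented behaviour on plain vectors, here is a minimal sketch that calls ffirst and flast directly (not through collap), where na.rm can be passed to each function on its own. The variable names are only for illustration.

```r
library(collapse)

x <- c(NA, "a", "b")   # leading NA, like var for id 1
y <- c("l", "k", NA)   # trailing NA, like var1 for id 2

# na.rm = TRUE: skip missing values at the start / end
first_skip <- ffirst(x, na.rm = TRUE)   # "a"
last_skip  <- flast(y, na.rm = TRUE)    # "k"

# na.rm = FALSE: return whatever sits in the first / last position
first_keep <- ffirst(x, na.rm = FALSE)  # NA
last_keep  <- flast(y, na.rm = FALSE)   # NA
```

So the two functions accept different na.rm values when called separately; the difficulty is only in passing different values through a single collap call.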
However, I can only set na.rm = TRUE or na.rm = FALSE for ffirst and flast together (not separately for each):
# na.rm = TRUE (default)
collap(df, ~id,
       custom = list(ffirst = "var", flast = "var1", fmean = "other"))
#   id var var1 other
# 1  1   a    r     5
# 2  2   e    k     8
# na.rm = FALSE
collap(df, ~id,
       custom = list(ffirst = "var", flast = "var1", fmean = "other"),
       na.rm = FALSE)
#   id  var var1 other
# 1  1 <NA>    r     5
# 2  2    e <NA>     8
Is there a way to set na.rm = FALSE for ffirst only (keeping na.rm = TRUE for flast) within the collap call?
Thanks