I think this might be a common problem, but I could not find an answer to it.
I have yearly data and I would like to create a pooled group (i.e., a new "combined" year for each ID that combines the data from all of that ID's years). For the combined year, I would like to take the value from year 3 if it is not missing. If the year 3 value is missing, I would like to take the maximum of the non-missing values from the prior years (and NA if all years are missing).
Here is what the input looks like:
library(tibble)

# Input: one row per ID and year
df <- tibble(
  ID = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Year = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  column1 = c(10, 11, NA, 12, 11, NA, 14, NA, 15),
  column2 = c(8, NA, 15, NA, NA, NA, NA, 14, 16)
)
And the desired output:
# Desired result: one "combined" row per ID
expected <- tibble(
  ID = c(1, 2, 3),
  Year = c("combined", "combined", "combined"),
  column1 = c(11, 12, 15),
  column2 = c(15, NA, 16)
)
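In case it helps clarify the rule, here is a rough sketch of the logic I am after. It assumes dplyr (version 1.0.0 or later, for `across()`), and the helper name `pick_combined` is just something made up for illustration. I am not sure this is the idiomatic way to do it:

```r
library(dplyr)
library(tibble)

df <- tibble(
  ID = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Year = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  column1 = c(10, 11, NA, 12, 11, NA, 14, NA, 15),
  column2 = c(8, NA, 15, NA, NA, NA, NA, 14, 16)
)

# Hypothetical helper (not from any package): take the year 3 value if it
# is present, otherwise the maximum of the non-missing earlier values,
# or NA if every year is missing.
pick_combined <- function(x, year) {
  y3 <- x[year == 3]
  if (length(y3) == 1 && !is.na(y3)) {
    return(y3)
  }
  prior <- x[year < 3 & !is.na(x)]
  if (length(prior) == 0) NA_real_ else max(prior)
}

# Collapse each ID to a single row, then label it as the combined year
combined <- df %>%
  group_by(ID) %>%
  summarise(
    across(c(column1, column2), ~ pick_combined(.x, Year)),
    .groups = "drop"
  ) %>%
  mutate(Year = "combined", .after = ID)
```

On the sample data this reproduces the values in the desired output, but it hard-codes year 3 as the preferred year and lists the value columns by name, so it would need adjusting for more years or columns.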
Many thanks for your help in advance.