In my data frame, I have 15 columns:
- Subject ID
- A set of 7 columns with the subject’s age at particular time points (age1, age2, etc.)
- A set of 7 columns with the subject’s scores at those same time points (score1 corresponds to age1, score2 to age2, etc.)
Most participants only have age1 and score1 (i.e., they only obtained a score at a single time point), but some will have more if they were tested at multiple time points.
I would like to create two new columns:
- minScore: The minimum value across columns score1:score7, ignoring NAs.
- scoreAge: The subject’s age corresponding to the time point of their minimum score. For example, if a subject’s lowest score is the value of score3, I want this column to have the value of age3, etc. This could be NA if the subject’s age is missing for that time point.
data <- structure(list(subject_id = c("191-11173897", "191-11561329",
"191-11700002", "191-11857141", "191-11933910"), age1 = c(39,
7, NA, NA, 16), age2 = c(36, NA, NA, NA, 37), age3 = c(9, NA,
NA, NA, NA), age4 = c(NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_), age5 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_
), age6 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
age7 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_
), score1 = c(10.6, 12.1, 9.8, NA, 10.6), score2 = c(9.8,
NA, NA, NA, 11), score3 = c(11.3, NA, NA, NA, NA), score4 = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), score5 = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), score6 = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), score7 = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_)), row.names = c(NA,
-5L), class = c("tbl_df", "tbl", "data.frame"))
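To make the desired output concrete, the logic described above could be sketched in base R roughly as follows. This is only an illustration under some assumptions: it uses a cut-down copy of the example data (three time points instead of seven, in a plain data frame called `df`), the helper names `score_cols`, `age_cols` and `min_idx` are made up for the sketch, and ties are resolved by taking the earliest time point, which the question does not specify.

```r
# Cut-down copy of the example data (time points 1 to 3 only).
df <- data.frame(
  subject_id = c("191-11173897", "191-11561329", "191-11700002",
                 "191-11857141", "191-11933910"),
  age1   = c(39, 7, NA, NA, 16),
  age2   = c(36, NA, NA, NA, 37),
  age3   = c(9, NA, NA, NA, NA),
  score1 = c(10.6, 12.1, 9.8, NA, 10.6),
  score2 = c(9.8, NA, NA, NA, 11),
  score3 = c(11.3, NA, NA, NA, NA)
)

# Column names for the paired score/age sets (use 1:7 on the full data).
score_cols <- paste0("score", 1:3)
age_cols   <- paste0("age", 1:3)

scores <- as.matrix(df[score_cols])
ages   <- as.matrix(df[age_cols])

# Position of the lowest non-missing score in each row.
# which.min() skips NAs and returns the first position on ties;
# rows with no scores at all get NA.
min_idx <- unname(apply(scores, 1, function(x) {
  if (all(is.na(x))) NA_integer_ else which.min(x)
}))

# Index both matrices with (row, column) pairs; an NA column index yields NA.
rows <- seq_len(nrow(df))
df$minScore <- scores[cbind(rows, min_idx)]
df$scoreAge <- ages[cbind(rows, min_idx)]

print(df[c("subject_id", "minScore", "scoreAge")])
```

On this subset, the first subject should end up with minScore 9.8 and scoreAge 36 (from score2 and age2), the third subject with minScore 9.8 but scoreAge NA because age1 is missing, and the fourth subject with NA in both columns.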