I have a dataframe:
mydf <- data.frame(
  col1 = c("54", "abc", "123", "54 abc", "zzz", "a", "99"),
  col2 = c("100", "200", "300", "400", "500", "600", "700"),
  stringsAsFactors = FALSE
)
In this data frame, I want to replace every element with NA unless it meets at least one of these conditions:
- it consists only of digits (e.g. “54” is kept, “54 abc” is discarded)
- it is one of the values in target_string
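To illustrate, here is how I understand the two checks on a plain character vector. This is only a minimal sketch: the names x, allowed and keep are made up for the example and are not part of my real data.

```r
# Illustrative values only, not my real data
x <- c("54", "54 abc", "a", "abc")
allowed <- c("a", "zzz")

# TRUE where the value is digits only or is one of the allowed strings
keep <- grepl("^[0-9]+$", x) | x %in% allowed
keep
# TRUE FALSE TRUE FALSE

# Everything that fails both checks becomes NA
x[!keep] <- NA
```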
I was not sure how to do this in R using apply, so I tried to write a loop:
target_string <- c("a", "zzz")
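replace_with_na_old <- function(df, target_string) {
  # seq_len() avoids iterating over c(1, 0) when df has no rows or columns
  for (i in seq_len(nrow(df))) {
    for (j in seq_len(ncol(df))) {
      value <- df[i, j]
      # Keep digit-only values and values listed in target_string
      if (!grepl("^[0-9]+$", value) && !(value %in% target_string)) {
        df[i, j] <- NA
      }
    }
  }
  return(df)
}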
mydf_cleaned_old <- replace_with_na_old(mydf, target_string)
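For reference, this is the result I am after, written out by hand from the two rules above (the name expected is just for illustration):

```r
# Desired result: "abc" and "54 abc" in col1 become NA, everything else is kept
expected <- data.frame(
  col1 = c("54", NA, "123", NA, "zzz", "a", "99"),
  col2 = c("100", "200", "300", "400", "500", "600", "700"),
  stringsAsFactors = FALSE
)
```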
Is there a more idiomatic (ideally vectorized) way to do this without the nested loop?