To preempt, I’ve figured out a solution to my problem using left_join()
(or merge()
), but I don’t really understand why ifelse()
doesn’t work for this problem. Would love to hear any other ways I could do this or improve my left_join()
use. Sorry if this is a long post.
Basically, I’m trying to create a new column in a data frame df1
by matching the observations in a column df1$code
to the observations in a corresponding column in an indexing df index.df$code
. The new column df1$type
would be the values from index.df$type
that correspond to the df1$code
value:
#index data frame
index.df <- data.frame(
code = c("c10", "c20", "c03", "c48", "c19"),
id = c("apple", "strawberry", "pear", "banana", "blackberry"),
type = c("pome", "aggregate", "pome", "berry", "aggregate")
)
> index.df
code id type
1 c10 apple pome
2 c20 strawberry aggregate
3 c03 pear pome
4 c48 banana berry
5 c19 blackberry aggregate
#df to add col to
df1 <- data.frame(
code = c("c10", "c19", "c03", "c20", "c19", "c10", "c48", "c03", "c10", "c03"),
id = c("apple", "blackberry", "pear","strawberry", "blackberry", "apple", "banana", "pear", "apple", "pear")
)
> df1
code id
1 c10 apple
2 c19 blackberry
3 c03 pear
4 c20 strawberry
5 c19 blackberry
6 c10 apple
7 c48 banana
8 c03 pear
9 c10 apple
10 c03 pear
And this is the desired output
> df2
code id type
1 c10 apple pome
2 c19 blackberry aggregate
3 c03 pear pome
4 c20 strawberry aggregate
5 c19 blackberry aggregate
6 c10 apple pome
7 c48 banana berry
8 c03 pear pome
9 c10 apple pome
10 c03 pear pome
I tried ifelse()
this way:
df2 <- df1 %>%
mutate(df1, type = ifelse(df1$code == index.df$code, index.df$type, NA))
> df2
no code id type
1 1 c10 apple pome
2 2 c19 blackberry <NA>
3 3 c03 pear pome
4 4 c20 strawberry <NA>
5 5 c19 blackberry aggregate
6 6 c10 apple pome
7 7 c48 banana <NA>
8 8 c03 pear pome
9 9 c10 apple <NA>
10 10 c03 pear <NA>
Why is this the output? Am I using ifelse()
incorrectly? Thank you in advance!
Also, the (rather bulky) code I used to get my desired output was:
df1 <- data.frame(
no = 1:10,
code = c("c10", "c19", "c03", "c20", "c19", "c10", "c48", "c03", "c10", "c03"),
id = c("apple", "blackberry", "pear","strawberry", "blackberry", "apple", "banana", "pear", "apple", "pear")
)
df2 <- index.df %>%
left_join(df1, by = c("code", "id")) %>%
arrange(no) %>%
select(-no)