I have a dataframe o which contains incorrect data. I have another dataframe r with the correct data. I want to replace the matching subset of o (both rows and columns) with the values from r. I'm seeking a solution that scales to many variables / columns. (I do have a vector rv with the column names of the variables that need replacing.)
library(tibble)

o <- tibble(id = c(2, 3, 4, 7, 10, 2), aa = letters[1:6], bb = LETTERS[2:7], cc = 11:16)
r <- tibble(id = c(2, 4), cc = c(111, 113), aa = c("aa", "cc"))
rv <- c("aa", "cc")
> o
# A tibble: 6 × 4
     id aa    bb       cc
  <dbl> <chr> <chr> <int>
1     2 a     B        11
2     3 b     C        12
3     4 c     D        13
4     7 d     E        14
5    10 e     F        15
6     2 f     G        16
> r
# A tibble: 2 × 3
     id    cc aa
  <dbl> <dbl> <chr>
1     2   111 aa
2     4   113 cc
I want to end up with (I don’t care about the order of rows or columns):
> o
# A tibble: 6 × 4
     id aa    bb       cc
  <dbl> <chr> <chr> <int>
1     2 aa    B       111
2     3 b     C        12
3     4 cc    D       113
4     7 d     E        14
5    10 e     F        15
6     2 aa    G       111
You can assume that all ids in r are in o, but you cannot assume that columns or rows are in the same order. There are also duplicated ids in o (but not in r), and the relevant columns should be overwritten for every matching row.
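
One possible way to approach this, offered as a minimal sketch rather than a definitive answer: look up each id of o in r with base R's match(), then overwrite only the matched rows, looping over rv. It assumes that ids in r are unique and that r contains every column named in rv. The names idx and hit are illustrative and not part of the original data.

```r
library(tibble)

o <- tibble(id = c(2, 3, 4, 7, 10, 2), aa = letters[1:6], bb = LETTERS[2:7], cc = 11:16)
r <- tibble(id = c(2, 4), cc = c(111, 113), aa = c("aa", "cc"))
rv <- c("aa", "cc")

# For every row of o, the position of its id in r (NA where r has no such id).
# match() returns the first hit, so this relies on ids in r being unique.
idx <- match(o$id, r$id)
hit <- !is.na(idx)

# Overwrite only the matched rows, one column at a time.
# Columns are addressed by name, so the column order in r does not matter.
for (v in rv) {
  o[[v]][hit] <- r[[v]][idx[hit]]
}

print(o)
```

Because the lookup runs from o into r, duplicated ids in o each pick up the same replacement row. Note that a column takes on the wider of the two types after assignment, so cc here would become a double rather than stay an integer.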