Today I had to write code like this:
data1 %>%
summarise(
ab1 = fn(a1, b1),
ab2 = fn(a2, b2),
ab3 = fn(a3, b3)
)
# imagine if there are 100 of them
If fn were a single-argument function, I could have done:
data1 %>%
summarise(across(starts_with("a"), fn))
But unfortunately, my function needs two columns as inputs. Is there a way to do this without writing a new line for every set of arguments?
You can use the map2* functions to pass two sets of columns.
library(dplyr)
library(purrr)
data1 %>%
summarise(map2_df(pick(starts_with("a")), pick(starts_with("b")), fn))
# a1 a2 a3
#1 21 57 93
This uses the data from @ThomasIsCoding but a different function: since your code uses summarise, the result should have a single row, so fn here returns a single value.
fn <- function(a, b) {
sum(a, b)
}
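The output above keeps the a1, a2, a3 names. If you want ab1, ab2, ab3 as in the question, one option is to rename after mapping. This is a minimal sketch, assuming the a* and b* columns appear in matching order and that no other columns start with "a" or "b"; the helper names a_cols and b_cols and the sub() pattern are my own, not part of the answer above.

```r
library(dplyr)
library(purrr)

# Example data from elsewhere in this thread
data1 <- data.frame(
  a1 = c(1, 2, 3), b1 = c(4, 5, 6),
  a2 = c(7, 8, 9), b2 = c(10, 11, 12),
  a3 = c(13, 14, 15), b3 = c(16, 17, 18)
)

# Summarising function: returns one value per pair of columns
fn <- function(a, b) sum(a, b)

# Hypothetical helpers: the two sets of columns, paired by position
a_cols <- select(data1, starts_with("a"))
b_cols <- select(data1, starts_with("b"))

res <- map2(a_cols, b_cols, fn) %>%               # list named a1, a2, a3
  set_names(sub("^a", "ab", names(a_cols))) %>%   # rename to ab1, ab2, ab3
  as_tibble()                                     # one-row tibble

res
```

Because the pairing is positional, it is worth checking that names(a_cols) and names(b_cols) line up before trusting the result.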
Another approach using reshaped data. If you can get over the hurdle of reshaping back and forth from longer form, the calculation would be trivial.
One benefit of this approach is that it is robust to column order, and you don’t need to specify the column prefixes upfront, provided there is some regular pattern you can specify with regex.
library(tidyverse)
data1 |>
# reshape long, in this case assuming the columns are all (letters)(numbers).
mutate(row = row_number()) |>
pivot_longer(cols = -row,
names_to = c(".value", "Pair"),
names_pattern = "(\\D+)(\\d+)") |>
# do the calculation with the two or more involved columns
mutate(ab = a*b, .by = c(row, Pair)) |>
# reshape wider again
pivot_wider(names_from = Pair, names_glue = "{.value}{Pair}", names_vary = "slowest",
values_from = a:ab)
Output using data from @ThomasIsCoding:
row a1 b1 ab1 a2 b2 ab2 a3 b3 ab3
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 4 4 7 10 70 13 16 208
2 2 2 5 10 8 11 88 14 17 238
3 3 3 6 18 9 12 108 15 18 270
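The code above uses mutate, so it keeps one row per input row. If you need the one-row summarise result from the question, the same long-form idea might look like the sketch below. It assumes dplyr 1.1.0 or later (for .by) and a fn that returns a single value; the sum-based fn here is only a stand-in.

```r
library(dplyr)
library(tidyr)

# Example data from elsewhere in this thread
data1 <- data.frame(
  a1 = c(1, 2, 3), b1 = c(4, 5, 6),
  a2 = c(7, 8, 9), b2 = c(10, 11, 12),
  a3 = c(13, 14, 15), b3 = c(16, 17, 18)
)

# Stand-in summarising function (assumption): one value per pair
fn <- function(a, b) sum(a, b)

res <- data1 |>
  # long form: one column per letter prefix, plus the pair number
  pivot_longer(everything(),
               names_to = c(".value", "Pair"),
               names_pattern = "(\\D+)(\\d+)") |>
  # one summary value per pair
  summarise(ab = fn(a, b), .by = Pair) |>
  # back to wide: ab1, ab2, ab3 in a single row
  pivot_wider(names_from = Pair, values_from = ab, names_prefix = "ab")

res
```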
You can probably use split.default to split the columns into groups by their names, e.g.,
data1 %>%
split.default(sub("\\D+", "ab", names(.))) %>%
map_dfr(\(...) do.call(fn, unname(...)))
which gives
# A tibble: 3 × 3
ab1 ab2 ab3
<dbl> <dbl> <dbl>
1 4 70 208
2 10 88 238
3 18 108 270
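To see what the split step does on its own, here is a base R sketch of the same idea without purrr. It assumes each group ends up with exactly two columns, in the order fn expects; the names groups and res are just for illustration.

```r
# Example data and function from this answer
data1 <- data.frame(
  a1 = c(1, 2, 3), b1 = c(4, 5, 6),
  a2 = c(7, 8, 9), b2 = c(10, 11, 12),
  a3 = c(13, 14, 15), b3 = c(16, 17, 18)
)
fn <- function(a, b) a * b

# Replace the letter prefix with "ab" so a1 and b1 share the key "ab1", etc.
keys <- sub("\\D+", "ab", names(data1))

# split.default splits the data frame by column, giving a named list
# of three two-column data frames: ab1, ab2, ab3
groups <- split.default(data1, keys)

# Call fn on the two columns of each group; unname() so the columns are
# passed positionally rather than by their original names
res <- as.data.frame(
  lapply(groups, function(g) do.call(fn, unname(as.list(g))))
)

res
```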
Example data and function:
data1 <- data.frame(
a1 = c(1, 2, 3),
b1 = c(4, 5, 6),
a2 = c(7, 8, 9),
b2 = c(10, 11, 12),
a3 = c(13, 14, 15),
b3 = c(16, 17, 18)
)
fn <- function(a, b) {
a * b
}