I want to merge duplicate rows in a table: whenever the dimensions (here `date` and `page`) are the same, all metrics (here `users` and `sessions`) in those rows should be summed together. Example input:
| date | page | users | sessions |
| ---------- | ------ | ----- | -------- |
| 2024-05-01 | a.html | 20 | 22 |
| 2024-05-01 | b.html | 14 | 16 |
| 2024-05-01 | a.html | 2 | 3 |
Result:
| date | page | users | sessions |
| ---------- | ------ | ----- | -------- |
| 2024-05-01 | a.html | 22 | 25 |
| 2024-05-01 | b.html | 14 | 16 |
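For reference, the example above can be reproduced with a minimal sketch like the one below. The data frame is typed in by hand from the table, and the dates are kept as plain ISO strings, which is an assumption about how the real data is stored.

```r
# Sample data copied from the input table above.
# Assumption: dates are plain ISO strings, not Date objects.
data <- data.frame(
  date = c("2024-05-01", "2024-05-01", "2024-05-01"),
  page = c("a.html", "b.html", "a.html"),
  users = c(20, 14, 2),
  sessions = c(22, 16, 3),
  stringsAsFactors = FALSE
)

# Sum every metric column within each date + page combination.
result <- aggregate(. ~ date + page, data, sum)
print(result)
```

The `.` on the left-hand side stands for every column not named on the right-hand side, so this only works when all remaining columns are numeric metrics.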
Here is some lengthy code that does exactly what I need:
if ("page" %in% colnames(data)) {
  if ("segment" %in% colnames(data)) {
    data <- aggregate(. ~ date + page + segment, data, sum)
  } else {
    data <- aggregate(. ~ date + page, data, sum)
  }
} else if ("campaign" %in% colnames(data)) {
  if ("channelGrouping" %in% colnames(data)) {
    data <- aggregate(. ~ date + campaign + channelGrouping, data, sum)
  } else {
    data <- aggregate(. ~ date + campaign, data, sum)
  }
} else if ("channelGrouping" %in% colnames(data)) {
  data <- aggregate(. ~ date + channelGrouping, data, sum)
} else {
  data <- aggregate(. ~ date, data, sum)
}
# ... other similar lines of code here
data <- data[order(data$date), ]
I want to simplify this so that it uses a dynamic list of dimensions and I don't need a lot of `if` conditions.
Changing:
data <- aggregate(. ~ date + page, data, sum)
To:
data <- aggregate(. ~ insert_dynamic_dimensions_list, data, sum)
Here is my attempt, but the `aggregate()` line throws a wrapup error:
columns <- c("date")
if ("page" %in% colnames(data)) columns <- c(columns, "page")
if ("campaign" %in% colnames(data)) columns <- c(columns, "campaign")
if ("channelGrouping" %in% colnames(data)) columns <- c(columns, "channelGrouping")
if ("segment" %in% colnames(data)) columns <- c(columns, "segment")
data <- aggregate(. ~ .[columns], data, sum) ### <== This throws aggregating error
data <- data[order(data$date), ]
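To make the goal concrete, here is a minimal sketch of one way the dynamic formula could be built: paste the dimension names into a string and convert it with `as.formula()`. The `dimensions` vector and the sample data frame are hypothetical and only there for illustration. The sketch also assumes every column that is not a dimension is a numeric metric.

```r
# Hypothetical sample data; the real `data` comes from elsewhere.
data <- data.frame(
  date = c("2024-05-01", "2024-05-01", "2024-05-01"),
  page = c("a.html", "b.html", "a.html"),
  users = c(20, 14, 2),
  sessions = c(22, 16, 3),
  stringsAsFactors = FALSE
)

# Candidate dimensions (hypothetical name); keep only those present in the data.
dimensions <- c("date", "page", "campaign", "channelGrouping", "segment")
columns <- intersect(dimensions, colnames(data))

# Build a formula such as `. ~ date + page` from the column names.
f <- as.formula(paste(". ~", paste(columns, collapse = " + ")))

# Aggregate with the generated formula, then sort by date as before.
data <- aggregate(f, data, sum)
data <- data[order(data$date), ]
print(data)
```

`intersect()` replaces the chain of `if` statements, and the pasted string gives `aggregate()` an ordinary formula object. `reformulate(columns, response = ".")` should produce an equivalent formula.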