I have two datasets like the ones described below:
library(tibble)

dfA <- tibble(
  name = c("John", "Michael", "Brian", "Thomas", "Peter"),
  expected = c(128.34, 453.6674, 789.2345, 678.34, 475.467),
  max = c(128.3464, 453.6901, 789.2740, 678.3739, 475.4908),
  min = c(128.3336, 453.6447, 789.1950, 678.3061, 475.4432)
)
dfB <- tibble(
  real = c(128.76, 453.658, 789.235, 677.2035, 475.4695),
  time = c(3.24, 5.65, 4.89, 2.28, 6.12)
)
dfA$max and dfA$min have been previously calculated using the following functions:
shift <- function(theoretical, tolerance) {
  return(tolerance * theoretical / 10^6)
}
calculate_max <- function(vector, tolerance) {
  res <- c()
  for (i in vector) {
    max <- i + shift(i, tolerance)
    res <- append(res, max)
  }
  return(res)
}
calculate_min <- function(vector, tolerance) {
  res <- c()
  for (i in vector) {
    min <- i - shift(i, tolerance)
    res <- append(res, min)
  }
  return(res)
}
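As a side note, the loops in calculate_max and calculate_min are probably not needed, because arithmetic in R works element-wise on vectors. A minimal sketch, assuming numeric input: the names calculate_max_vec and calculate_min_vec are hypothetical, and tolerance = 50 is an assumption inferred from the example values (it reproduces the max and min columns above after rounding to four decimals).

```r
# shift() already works on whole vectors, since `*` and `/` are element-wise
shift <- function(theoretical, tolerance) {
  tolerance * theoretical / 10^6
}

# Hypothetical loop-free counterparts of calculate_max() and calculate_min()
calculate_max_vec <- function(vector, tolerance) vector + shift(vector, tolerance)
calculate_min_vec <- function(vector, tolerance) vector - shift(vector, tolerance)

expected <- c(128.34, 453.6674, 789.2345, 678.34, 475.467)

# tolerance = 50 is an assumption inferred from the example data
max_vec <- calculate_max_vec(expected, 50)
min_vec <- calculate_min_vec(expected, 50)

round(max_vec, 4)
round(min_vec, 4)
```

One small difference: for an empty input the loop versions return NULL, while these return numeric(0).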
I’d like to intersect (join) both datasets, keeping the rows where the dfB$real value lies between the dfA$min and dfA$max values.
I have already found a solution using the fuzzyjoin package, but I’m curious how to do this manually. From what I’ve read, iterating over the rows of a tibble is not considered good practice, and I cannot think of a way to achieve this without looping over rows.
The working solution with fuzzyjoin would be as follows (in case it’s helpful to anyone):
library(fuzzyjoin)

fuzzy_right_join(dfB, dfA,
                 by = c("real" = "max", "real" = "min"),
                 match_fun = list(`<=`, `>=`))
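For comparison, one loop-free way to do the range match by hand might be to compare every dfB$real value against every dfA interval with outer() and then pick out the pairs that fall in range. This is only a sketch under a few assumptions: it uses base data.frame instead of tibble so that it runs without extra packages, the names in_range, hits and matched are hypothetical, and it keeps only matching pairs (inner-join behaviour), so the unmatched dfA rows that a right join retains would have to be appended separately.

```r
dfA <- data.frame(
  name = c("John", "Michael", "Brian", "Thomas", "Peter"),
  expected = c(128.34, 453.6674, 789.2345, 678.34, 475.467),
  max = c(128.3464, 453.6901, 789.2740, 678.3739, 475.4908),
  min = c(128.3336, 453.6447, 789.1950, 678.3061, 475.4432)
)
dfB <- data.frame(
  real = c(128.76, 453.658, 789.235, 677.2035, 475.4695),
  time = c(3.24, 5.65, 4.89, 2.28, 6.12)
)

# Logical matrix: rows index dfB, columns index dfA;
# TRUE where min <= real <= max for that pair
in_range <- outer(dfB$real, dfA$min, `>=`) & outer(dfB$real, dfA$max, `<=`)

# Row/column positions of every matching (dfB, dfA) pair
hits <- which(in_range, arr.ind = TRUE)

# Bind the matching rows of both datasets side by side
matched <- cbind(dfB[hits[, "row"], ], dfA[hits[, "col"], ])
matched
```

The intermediate matrix has nrow(dfB) * nrow(dfA) cells, so this approach is probably only reasonable for small to medium inputs.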
Thanks for your help!