In the tidyverse, some verbs restrict how many rows the result of a processing step may have. Most prominently, mutate()
expects each new column to have as many rows as the original data set (or length 1). For example, to get density values for a variable x, we could try:
library(magrittr)
df %>%
  dplyr::mutate(dx = density(x)$x,
                dy = density(x)$y)
This fails with an error along the lines of "Caused by error: ! dx must be size 100 or 1, not 512."
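The mismatch can be checked directly. By default, density() evaluates the estimate at n = 512 points regardless of the input length, whereas mutate() needs one value per row (here 100) or a single value. A minimal check, using the same x as in the data at the end of this post:

```r
# Same x as in the "Data used" section
set.seed(1)
x <- sample(1:999, 100)

# density() uses n = 512 evaluation points by default
d <- density(x)

n_input   <- length(x)    # rows mutate() would expect
n_density <- length(d$x)  # points density() actually returns

n_input
n_density
```

So the 512 in the error message comes from density() itself, not from the data.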
But in many situations the number of rows changes during data processing. Is there an elegant way to handle this within
tidyverse-style code?
All I have come up with so far is wrapping the step in {} wherever the row number changes.
See the following example, where I interpolate y over a regular grid of x values (which also changes the row number):
library(magrittr)
df %>%
  # Some data processing where row number stays the same
  dplyr::mutate(x2 = x*x,
                id = 1:dplyr::n()) %>%
  # Row number changes! So I use code inside {}
  {time_interpolate_for <- seq(min(.$x), max(.$x), 1)
   data.frame(x = time_interpolate_for,
              y = approx(.$x, .$y, xout = time_interpolate_for)$y)
  } %>%
  # Going on with the new data and processing it so that row number remains the same
  dplyr::mutate(xy_diff = x - y)
Is there a better way to do this?
Data used:
# Generate data
set.seed(1)
x <- sample(1:999, 100); y <- .5*x + rnorm(100)
df <- data.frame(x, y)
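For comparison, one direction that may be worth exploring is dplyr::reframe(), which, unlike mutate(), accepts results with any number of rows. The following is only a sketch under assumptions: it requires dplyr 1.1.0 or later, and interpolate_grid() is a hypothetical helper introduced here for illustration, not an existing function.

```r
library(dplyr)  # assumes dplyr >= 1.1.0, where reframe() was introduced

# Same data as above
set.seed(1)
x <- sample(1:999, 100); y <- .5*x + rnorm(100)
df <- data.frame(x, y)

# Hypothetical helper: interpolate y on a regular grid spanning the range of x
interpolate_grid <- function(x, y, step = 1) {
  x_out <- seq(min(x), max(x), by = step)
  data.frame(x = x_out, y = approx(x, y, xout = x_out)$y)
}

res <- df %>%
  # Row number stays the same
  mutate(x2 = x * x,
         id = 1:n()) %>%
  # Row number changes: an unnamed data frame result is unpacked into columns
  reframe(interpolate_grid(x, y)) %>%
  # Row number stays the same again
  mutate(xy_diff = x - y)

# The density example, returning 512 rows instead of raising an error
dens <- df %>%
  reframe(dx = density(x)$x,
          dy = density(x)$y)
```

Like the {} version, reframe() keeps only the columns it is given, so x2 and id do not survive the interpolation step in this sketch either.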