Here is how my data looks like:
# Example data. Lots of more variables liks x_ and y_
df <- data.frame(id = c(1,2,3), x_num = c(9,8,7), y_num = c(6,5,4),
x_text = c("a","b","c"), y_text = c("d","e","f"),
z = c(T,F,F))
Here is the desired output:
# Long
target <- data.frame(id = c(1,1,2,2,3,3), type = c("x","y","x","y","x","y"),
num = c(9,6,8,5,7,4), text = c("a","d","b","e","c","f"),
z = c(T,T,F,F,F,F))
So, this is a wide to long conversion, with a unique aspect: some variable have two measures, prefixed with x_
and y_
. It is enough then to create one variable “type” which specifies the measure, for all the variables. Also, the rest of the variable name should be part of the name of another variable, which contain the values.
So x_name = 5
should result in type=x
and name = 5
.
And this should of course respect the other variables with are unique, like id
and z
.
I’ve been looking at this page, and my two unsuccessful attempts are:
# This can do one (x,y) pair at a time, not efficient.
df[c("x_num","y_num")] %>%
pivot_longer(
cols = `x_num`:`y_num`,
names_to = "type",
values_to = "values"
) %>%
mutate(type=substring(type,1,1))
# Not desired output
reshape(df,
direction = "long",
varying = list(names(df)[2:5]),
v.names = "values",
idvar = c("id"),
timevar = "type")
I do not use data.table, so rather would use dataframes. Any ideas?