I want to generate x and y variables with a controlled correlation. Currently, I'm using the code below to simulate x and y. However, I want to write a function in which I can specify the correlation between x and y (the `rho` argument, which is not used yet). Does anyone know how I can adjust this code to achieve a specific correlation between x and y? The error terms `erorr_x` and `erorr_y` have mean zero and are independent of each other.
n = 100
sigma_x = 3
sigma_y = 3
beta = c(5,2)
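SIM_DATA <- function(n = 100, sigma_x, sigma_y, beta, rho, MRep = 50) {
  # rho is accepted but not used yet; it is the target correlation I want to control
  df <- vector("list", MRep)  # one n x 2 matrix per replication
  for (i in seq_len(MRep)) {
    sim_data <- matrix(NA_real_, nrow = n, ncol = 2)
    erorr_x <- rnorm(n, mean = 0, sd = 1)  # standard normal error
    erorr_y <- rexp(n, rate = 1) - 1       # centered exponential error (mean 0, variance 1)
    x <- sigma_x * erorr_x
    y <- beta[1] + beta[2] * x + sigma_y * erorr_y
    sim_data[, 1] <- x
    sim_data[, 2] <- y
    df[[i]] <- sim_data
  }
  return(df)  # return only after all MRep replications are stored
}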
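
For reference, here is a sketch of the correlation this setup already implies. It assumes only what is stated above: the two error terms are independent, and each has variance 1 (true for a standard normal and for a rate-1 exponential). The symbols $\varepsilon_x$ and $\varepsilon_y$ are notation introduced here for `erorr_x` and `erorr_y`, and $\beta_1, \beta_2$ stand for `beta[1]`, `beta[2]`.

$$
\begin{aligned}
x &= \sigma_x \varepsilon_x, \qquad y = \beta_1 + \beta_2 x + \sigma_y \varepsilon_y,\\
\operatorname{Var}(x) &= \sigma_x^2, \qquad \operatorname{Var}(y) = \beta_2^2 \sigma_x^2 + \sigma_y^2, \qquad \operatorname{Cov}(x, y) = \beta_2 \sigma_x^2,\\
\operatorname{Corr}(x, y) &= \frac{\beta_2 \sigma_x}{\sqrt{\beta_2^2 \sigma_x^2 + \sigma_y^2}}.
\end{aligned}
$$

With the values above ($\sigma_x = \sigma_y = 3$, $\beta_2 = 2$), this gives $6/\sqrt{45} \approx 0.894$. So if this derivation is right, the correlation is currently fixed by `sigma_x`, `sigma_y` and `beta[2]`, and reaching a chosen `rho` would presumably mean letting one of those be determined by the others.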