I have a data frame in R with country name, year and several variables as columns. For each country, I have data for my variables for the years 1970-2020. The problem is that, for most countries, values are missing at one or both ends of the series: for example, 1970 and 1971 for country A, or 2019 and 2020 for country B.
So, for example:
Country  Year  column1  column2  column3
USA      1970  NA       0,54     NA
USA      1971  NA       0,58     0,9
USA      1972  0,23     0,6      0,85
etc.
I tried to solve the problem using na.approx, which I had previously used for interpolating missing values. It worked well when there were values before and after the missing value, but it does not work for missing values at the edges of the series. The code ran without error, but nothing changed in my dataset; the missing values remained. Maybe this is possible with na_spline, but I didn’t know how to do it. So I opted for another approach: I want to extrapolate those missing values linearly from the trend in the first / last 6 observed values.
My code looks like this:
columns_interpoliert <- c("life_expectancy", "economic_growth", "prison_rates")
function_rand <- function(x) {
  if (is.na(x[1])) {
    first_nona <- min(which(!is.na(x)))
    differential <- (x[first_nona + 5] - x[first_nona]) / 5
    for (i in 1:first_nona - 1) {
      x[i] <- x[first_nona] - differential * (first_nona - i)
    }
  }
  if (is.na(x[length(x)])) {
    last_nona <- max(which(!is.na(x)))
    differential <- x[last_nona - 5] - x[last_nona] / 5
    for (i in last_nona + 1:length(x)) {
      x[i] <- x[last_nona] + differential * (i - last_nona)
    }
  }
  return(x)
}
data_new <- wvs_filled %>%
group_by(country) %>%
mutate(across(columns_interpoliert, function_rand))
I get an error code here:
"Error in `mutate()`:
ℹ In argument: across(columns_interpoliert, function_rand).
ℹ In group 1: country = 2.
Caused by error in across():
! Can’t compute column life_expectancy.
Caused by error in dplyr_internal_error():"
Does anybody know why this might happen? Any help is appreciated!
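For reference, here is a minimal sketch of the edge extrapolation described above, with the index ranges written out explicitly. In R, `:` binds tighter than `+` and `-`, so `1:first_nona - 1` is read as `(1:first_nona) - 1` and `last_nona + 1:length(x)` as `last_nona + (1:length(x))`. The second loop therefore writes past the end of the vector and returns a result longer than the group, which may be what `mutate()` is rejecting. The helper name `extrapolate_edges` and its `window` argument are hypothetical and not part of the original code. The sketch assumes interior gaps have already been filled and that a series with fewer than two observed values should be returned unchanged.

```r
# Hypothetical helper (not the original function_rand): linearly extends the
# leading and trailing NA runs of a numeric vector without changing its length.
extrapolate_edges <- function(x, window = 5) {
  ok <- which(!is.na(x))
  # Assumption: with fewer than two observed values no slope can be
  # estimated, so the vector is returned as is (this covers all-NA groups).
  if (length(ok) < 2) return(x)

  first_ok <- ok[1]
  last_ok <- ok[length(ok)]
  n <- length(x)

  if (first_ok > 1) {
    # Reference point `window` steps after the first observed value,
    # clamped so it never runs past the last observed value.
    ref <- min(first_ok + window, last_ok)
    slope <- (x[ref] - x[first_ok]) / (ref - first_ok)
    idx <- seq_len(first_ok - 1)          # positions 1 .. first_ok - 1
    x[idx] <- x[first_ok] - slope * (first_ok - idx)
  }

  if (last_ok < n) {
    # Reference point `window` steps before the last observed value.
    ref <- max(last_ok - window, first_ok)
    slope <- (x[last_ok] - x[ref]) / (last_ok - ref)
    idx <- (last_ok + 1):n                # parentheses matter here
    x[idx] <- x[last_ok] + slope * (idx - last_ok)
  }

  x
}

# Toy series with two leading NAs and one trailing NA.
extrapolate_edges(c(NA, NA, 3, 4, 5, 6, 7, 8, NA))
# returns 1 2 3 4 5 6 7 8 9
```

In the pipeline, this would presumably slot in as `mutate(across(all_of(columns_interpoliert), extrapolate_edges))` after `group_by(country)`. Because the result always has the same length as the input, each group keeps its original number of rows.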