I’m working with a data frame with multiple variables, and I’ve been creating correlation matrices with no problem. I just created a new variable, which is the ratio of two other variables, and tried to run correlations between that new variable and others. R returns “NaN” (Not a Number). All relevant variables are numeric, and is.nan() on the new variable returned all FALSE. I also tried limiting the decimal places of the new variable, thinking there were too many, but that did nothing. Here’s my relevant code:
S1901Income_2022_crimetest <- S1901Income2022 |>
  select(HouseholdsTotal, Pct0to10k, Pct10kto14999, Pct15kto24999...8, Pct25kto34999, Pct35kto49999, Pct50kto74999, Pct75kto99999, Pct100kto149999, Pct150kto199999, Pct200korMore...22, HouseholdMedianIncome)
ratiohighlow <- S1901Income_2022_crimetest$Pct200korMore...22 / S1901Income_2022_crimetest$Pct0to10k
ratiohighlow <- round(ratiohighlow, digits = 8)
Crime2022_S1901Income <- Crime2022 |>
  select(-FIPS, -year)
IncomeandCrime2022 <- cbind(Crime2022_S1901Income, S1901Income_2022_crimetest, ratiohighlow)
cor(IncomeandCrime2022$TotIncidents, ratiohighlow)
In this code, “ratiohighlow” is the new, problematic variable. I’ve been running correlations between “TotIncidents” and other variables without issue.
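A minimal sketch with made-up vectors (the names `incidents`, `high`, and `low` are hypothetical, and none of this is the real data) that reproduces the same symptom. It assumes the denominator contains a zero, which may or may not be true of Pct0to10k: dividing a nonzero value by zero gives Inf, is.nan() does not flag Inf, and cor() cannot produce a usable result when a value is infinite. is.finite() is the check that would reveal it.

```r
# Made-up example data; NOT the real census or crime figures.
incidents <- c(10, 20, 30, 40)
high      <- c(1, 2, 3, 4)
low       <- c(5, 0, 2, 4)      # assumption: one denominator is exactly 0

ratio <- high / low             # 2 / 0 evaluates to Inf, not NaN

all_not_nan <- !any(is.nan(ratio))     # TRUE: is.nan() does not catch Inf
n_nonfinite <- sum(!is.finite(ratio))  # counts Inf, -Inf, NaN, and NA

r_raw <- cor(incidents, ratio)         # not a usable number while Inf is present

# One possible workaround: correlate only the rows where the ratio is finite.
ok <- is.finite(ratio)
r_finite <- cor(incidents[ok], ratio[ok])

print(n_nonfinite)
print(r_raw)
print(r_finite)
```

If `sum(!is.finite(ratiohighlow))` is greater than zero on the real data, this is probably the same situation. Dropping those rows is only one option; whether that is appropriate depends on what a zero in the denominator means for the analysis.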