I am trying to run a stepwise regression to find the best predictors for my model, but when I run the code in R I get an error.
I called stepAIC() (from the MASS package) on my data set and got the following error:
library(MASS)  # provides stepAIC()
reg <- lm(depression ~ ., data = dataset)  # regress depression on all other columns
stepAIC(reg, direction = "both")
Error in stepAIC(reg, direction = "both") :
AIC is -infinity for this model, so 'stepAIC' cannot proceed
I am using a data set of US counties with corresponding demographic and geographic data. The dataset has 3135 observations, so I am only including the first 6 rows.
> head(dataset)
# A tibble: 6 × 17
county state depression FIPS size total black white asian latino native pacificIslander other mixed diversity density classification
<chr> <chr> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 Abbeville SC 22.3 45001 491. 24368 6372 16658 58 441 28 0 138 673 0.463 49.6 Small
2 Acadia LA 27.3 22001 655. 57674 9511 44413 95 1780 13 0 40 1822 0.378 88.0 Small
3 Accomack VA 18.8 51001 449. 33367 9446 19813 268 3084 56 0 55 645 0.558 74.3 Small
4 Ada ID 20.3 16001 1052. 497494 5815 411348 12672 44317 1668 990 2384 18300 0.306 473. Medium
5 Adair IA 16.6 19001 569. 7479 48 7046 9 196 0 12 49 119 0.111 13.1 Small
6 Adair KY 26.1 21001 405. 18887 307 17308 59 468 3 24 6 712 0.158 46.6 Small
Does anyone know what is causing this and how I can fix it?
(For reproducibility)
> dput(head(dataset))
structure(list(county = c("Abbeville", "Acadia", "Accomack",
"Ada", "Adair", "Adair"), state = c("SC", "LA", "VA", "ID", "IA",
"KY"), depression = c(22.3, 27.3, 18.8, 20.3, 16.6, 26.1), FIPS = c("45001",
"22001", "51001", "16001", "19001", "21001"), size = c(491.195,
655.244, 449.32, 1052.015, 569.271, 405.292), total = c(24368,
57674, 33367, 497494, 7479, 18887), black = c(6372, 9511, 9446,
5815, 48, 307), white = c(16658, 44413, 19813, 411348, 7046,
17308), asian = c(58, 95, 268, 12672, 9, 59), latino = c(441,
1780, 3084, 44317, 196, 468), native = c(28, 13, 56, 1668, 0,
3), pacificIslander = c(0, 0, 0, 990, 12, 24), other = c(138,
40, 55, 2384, 49, 6), mixed = c(673, 1822, 645, 18300, 119, 712
), diversity = c(0.463214524775288, 0.377844145794569, 0.558287177093522,
0.306246361818157, 0.111453827636757, 0.157904845203141), density = c(49.6096255051456,
88.0191195951432, 74.2611056707914, 472.896299007143, 13.1378552569866,
46.600969177778), classification = c("Small", "Small", "Small",
"Medium", "Small", "Small")), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))