I am trying to predict the occurrence of a species (presence/absence) using a GAM in R. I have fit the GAM, but have two questions.
- Why aren’t my predictions binary (i.e., 1/0 for presence/absence)?
- Instead of creating a testing dataset, can I predict occurrence over an entire raster stack using terra::predict()?
library(terra)
library(mgcv)
library(sf) # provides st_as_sf() and st_sample() used below
#polygons to generate random points within
v <- vect(system.file("ex/lux.shp", package="terra")) |> st_as_sf()
#get a raster from terra
r <- rast(system.file("ex/elev.tif", package="terra"))
# Create 50 random points
set.seed(50)
pnts <- st_sample(v, size = 50, type = "random") |> st_as_sf()
# Return the exact raster value the point lies on
pnts_r <- terra::extract(r, pnts, method = "simple")
# Add occurrence column
pnts_r$occ <- as.factor(sample(0:1, 50, replace = TRUE))
# Fit the GAM
mod_gam <- mgcv::gam(formula = occ ~ elevation,
                     data = pnts_r,
                     family = binomial(link = "logit"),
                     method = "REML")
# Predict using our training data
predict(mod_gam) # Why aren't the responses 1s and 0s?
# Predict using the entire raster
terra::predict(mod_gam, r, fun = "predict", na.rm=TRUE) # Error invalid name(s)
I understand the variable names in the model need to be identical to those in the raster, but in this case there is only one (“elevation”) and it’s identical in both.
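For reference, here is a minimal sketch of the prediction scales involved in the first question. It uses simulated data in place of the extracted raster values (the `dat`, `mod`, `p_link`, `p_prob`, and `p_bin` names are hypothetical, and the 0.5 cutoff is an assumption, not something the model provides). By default, `predict()` on a binomial GAM returns values on the link (log-odds) scale; `type = "response"` returns probabilities, which would still need thresholding to become 1/0.

```r
library(mgcv)

set.seed(50)
# Simulated stand-in for the extracted point data (hypothetical values)
dat <- data.frame(elevation = runif(50, 150, 550))
dat$occ <- sample(0:1, 50, replace = TRUE)

mod <- gam(occ ~ elevation, data = dat,
           family = binomial(link = "logit"), method = "REML")

# Default predictions are on the link (log-odds) scale
p_link <- predict(mod)

# type = "response" back-transforms to probabilities between 0 and 1
p_prob <- predict(mod, type = "response")

# A 0.5 cutoff is an assumption; pick a threshold suited to the data
p_bin <- as.integer(p_prob > 0.5)
```

For the raster case, the `terra::predict()` documentation lists the SpatRaster as the first argument and the fitted model as the second, with extra arguments such as `type = "response"` passed on to the model's predict method, so the argument order in the call above may be worth checking.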