I am working with a dataset that used a BACI design in R. The data was collected in 4 sub-watersheds (Subwatershed) and 5 hillslope positions that were nested in each sub-watershed (Position). I have created daily averages (WC_15cm_daily_avg) for each Date value (yyyy-mm-dd). I also have a column called Period (Cal or Trt) and one called Treatment (0 or 1) to indicate which period it was (Calibration or Treatment) and which treatment it was (Control or Thinned). I want to see if the harvest had an effect, which would show up in a significant interaction between Treatment and Period.
Because of repeated measures (the data was collected from 2016 to 2021 with daily values), there is autocorrelation present. I want to use a generalized linear mixed model to take this autocorrelation and random effects of Sub-watershed and Position into account. WC_15cm_daily_avg ranges from 0.05 to 0.4, so I am using a beta distribution. The issue is that when I use ar for the autocorrelation I end up with overdispersion. However, if I don’t include anything for autocorrelation then the data is still heavily autocorrelated. Any thoughts? I am open to not using a generalized linear mixed model, but I need to be able to test if harvest had an effect (Period * Treatment) and I need to consider repeated measurements and autocorrelation. Here is the model and below is a snapshot of the data.
Create a Time column as a factor with explicit levels
data_subset$Time <- factor(1:nrow(data_subset), levels = 1:nrow(data_subset))
# Fit mixed-effects model with AR(1) correlation structure
model_ar1 <- glmmTMB(WC_15cm_daily_avg ~ Treatment * Period +
( 1 | Subwatershed/Position) +
ar1(Time + 0 | ID),
family = beta_family(link = "logit"),
data = data_subset)
# Summarize the model
summary(model_ar1)
Family: beta ( logit )
Formula: WC_15cm_daily_avg ~ Treatment * Period + (1 | Subwatershed/Position) +
ar1(Time + 0 | ID)
Data: data_subset
AIC BIC logLik deviance df.resid
-44413.4 -44351.9 22215.7 -44431.4 6920
Random effects:
Conditional model:
Groups Name Variance Std.Dev. Corr
Position:Subwatershed (Intercept) 2.503e-09 5.003e-05
Subwatershed (Intercept) 6.031e-08 2.456e-04
ID Time1 2.756e-01 5.250e-01 0.99 (ar1)
Number of obs: 6929, groups: Position:Subwatershed, 15; Subwatershed, 3; ID, 15
Dispersion parameter for beta family (): 3.28e+04
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.50792 0.13357 -11.290 <2e-16 ***
Treatment -0.02583 0.18376 -0.141 0.888
PeriodTrt 0.63821 0.02143 29.783 <2e-16 ***
Treatment:PeriodTrt 0.03444 0.02919 1.180 0.238
'data.frame': 6929 obs. of 9 variables:
$ ID : chr "TRE1" "TRE1" "TRE1" "TRE1" ...
$ Date : Date, format: "2016-06-01" "2016-06-02" "2016-06-03" ...
$ WC_15cm_daily_avg: num 0.323 0.321 0.32 0.32 0.319 ...
$ Year : int 2016 2016 2016 2016 2016 2016 2016 2016 2016 2016 ...
$ Position : chr "1" "1" "1" "1" ...
$ Subwatershed : chr "TRE" "TRE" "TRE" "TRE" ...
$ Treatment : num 1 1 1 1 1 1 1 1 1 1 ...
$ Period : chr "Cal" "Cal" "Cal" "Cal" ...
$ Time : Factor w/ 6929 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
- attr(*, "na.action")= 'omit' Named int [1:1164] 292 293 294 295 296 297 311 312 313 314 ...
..- attr(*, “names”)= chr [1:1164] “292” “293” “294” “295” …
Elise Miller is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.