Q: In the tidyverts/fable forecasting framework, with many target time series to forecast, how do I supply a different target transform parameter for each series?
In particular, I’d like to do a Box-Cox transformation of each time series but using a different lambda for each series, e.g., the lambda estimated from the Guerrero method on each series. How do I do this within the framework?
Below are a couple of my attempts; both produce errors.
If there is no way to do this within the framework, is there a good workaround? I propose one below, but I assume one can do better.
library(fpp3)
# construct data in transformed space directly
z1 <- arima.sim(n=104,list(ar=0.9))
z2 <- arima.sim(n=104,list(ma=0.5))
# inverse to get data in the untransformed space
y1 <- fabletools::inv_box_cox(z1, lambda=0.25)
y2 <- fabletools::inv_box_cox(z2, lambda=0.75)
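# optional sanity check on the simulated data (a sketch, not part of the
# problem itself): assuming fabletools::box_cox() is the exact inverse of
# fabletools::inv_box_cox() for these positive lambdas, transforming back
# should recover the simulated series up to floating-point error.
# z1.back and z2.back are illustrative names only.
z1.back <- fabletools::box_cox(as.numeric(y1), lambda=0.25)
z2.back <- fabletools::box_cox(as.numeric(y2), lambda=0.75)
all.equal(z1.back, as.numeric(z1))
all.equal(z2.back, as.numeric(z2))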
# create tsibble for time series modeling
tibble(idx=1:104, y1=y1, y2=y2) %>%
  pivot_longer(cols=c(y1,y2), names_to='series', values_to='value') %>%
  tsibble(index=idx, key=series) ->
  dat
# estimate optimal box-cox transform lambda for each series using guerrero
# method
dat %>%
  fabletools::features(value, features='guerrero') ->
  lambdas
# # A tibble: 2 × 2
#   series lambda_guerrero
#   <chr>            <dbl>
# 1 y1              0.0991
# 2 y2              0.751
# set up the optimal lambdas as exogenous regressors?
dat %>% inner_join(lambdas, by=join_by(series)) -> dat.xrg
dat.xrg %>%
  model(arima=ARIMA(box_cox(value,lambda=lambda))) ->
  fit
# Error in `.g()`:
# ! Response variable transformation has incompatible lengths, all arguments must be the length of the data 104 or 1.
# Run `rlang::last_trace()` to see where the error occurred.
# try defining lambda outside the data, with one value per row?
lambdas %>% pull(lambda_guerrero) %>% rep(each=104) -> lambda
length(lambda)
# [1] 208
dat %>%
  model(arima=ARIMA(box_cox(value, lambda=lambda))) ->
  fit
# Error in `.g()`:
# ! Response variable transformation has incompatible lengths, all arguments must be the length of the data 208 or 1.
# Run `rlang::last_trace()` to see where the error occurred.
# just going with a tidy-hack
# is this the best one can do?
dat %>%
  nest(.by=series) %>%
  inner_join(lambdas, by = "series") %>%
  mutate(
    fit=map2(
      data,
      lambda_guerrero,
      # anonymous function (R >= 4.1 syntax): fit one series with its own lambda
      \(.dat, .lambda)
        model(
          .dat,
          arima=ARIMA(box_cox(value, lambda=.lambda))
        )
    )
  ) %>%
  unnest(cols=fit) %>%
  select(series, arima) %>%
  as_mable(key='series', model='arima') ->
  fit
# looks right
fit
# # A mable: 2 x 2
# # Key:     series [2]
#   series          arima
#   <chr>         <model>
# 1 y1     <ARIMA(1,0,2)>
# 2 y2     <ARIMA(0,0,1)>
# still get access to all the nice fable tools
fit %>% accuracy()
# # A tibble: 2 × 11
#   series .model .type       ME  RMSE   MAE   MPE  MAPE  MASE RMSSE     ACF1
#   <chr>  <chr>  <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl>
# 1 y1     arima  Training 0.311  3.97 1.91  -450.  512. 0.956 0.961 -0.123
# 2 y2     arima  Training 0.269  1.16 0.918 -884. 1003. 0.879 0.840  0.00342
# can make a nice plot
fit %>%
  augment() %>%
  ggplot(aes(x=idx, y=value)) +
  geom_point() +
  geom_line(aes(y=.fitted),color='blue') +
  facet_grid(rows=vars(series))