I am trying to forecast a short, stationary time series (40 values). So far I have done this with statistical methods, and the results are reasonably satisfactory. To explore other forecasting methods, I am considering gradient boosting regressors (GradientBoostingRegressor, XGBRegressor) and SVR. However, I cannot get usable forecasts: the models return a single value for all x_train. I can't figure out what I'm doing wrong, and I suspect I am not preparing the data for the models correctly.
Here is what I do:
1. To get rid of the trend, I calculate the growth rate (do I need to remove the trend at all?); y holds the resulting rate values.
2. I set X to the integers from 1 to the length of y.
3. I split the data into training and test samples (ratio 0.8) and pass them to the models.
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
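# Computing growth rates from the GDP series
# (gdp_countries is assumed to be a pandas DataFrame indexed by country, and
# country a row label; both are defined elsewhere)
values = gdp_countries.loc[country].values
cagrs = [(values[i + 1] - values[i]) / values[i] for i in range(len(values) - 1)]
# X is the time index 1..len(cagrs), one feature per sample; y is the growth rates
X, y = [[x] for x in range(1, len(cagrs) + 1)], cagrs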
# Splitting the data
size = round(len(y)*0.8)
X_train, X_test, y_train, y_test = X[:size], X[size:], y[:size], y[size:]
# Initializing the model
gbr = GradientBoostingRegressor(n_estimators=10, learning_rate=0.1, max_depth=3)
# Training the model
gbr.fit(X_train, y_train)
# Making predictions
y_pred = gbr.predict(X_test)
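
For reference, here is a self-contained sketch that reproduces the symptom without the original data. The series `levels` is a hypothetical stand-in for the GDP values (randomly generated, not real data), and `random_state=0` is added only to make the run repeatable. Everything else follows the three steps described above.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical stand-in for the real GDP series: 41 positive levels,
# which yield 40 growth rates (the real data is not shown in the post).
rng = np.random.default_rng(0)
levels = 100 * np.cumprod(1 + rng.normal(0.02, 0.03, size=41))

# Step 1: growth rate between consecutive observations
rates = np.diff(levels) / levels[:-1]

# Step 2: the time index 1..len(rates) is the only feature
X = np.arange(1, len(rates) + 1).reshape(-1, 1)
y = rates

# Step 3: chronological 80/20 split (no shuffling)
size = round(len(y) * 0.8)
X_train, X_test = X[:size], X[size:]
y_train, y_test = y[:size], y[size:]

# Same hyperparameters as in the snippet above
gbr = GradientBoostingRegressor(
    n_estimators=10, learning_rate=0.1, max_depth=3, random_state=0
)
gbr.fit(X_train, y_train)
y_pred = gbr.predict(X_test)

# Count how many different values the model predicts on the test range
n_distinct = np.unique(y_pred).size
print("distinct predicted values on the test range:", n_distinct)
```

With this setup the forecast for the test range is a single repeated value. One likely reason is that every test index lies beyond the largest index seen in training. Tree-based models cannot extrapolate outside the training range, so all test points fall into the same leaf of each tree.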