I have a formula (Y = x1 + A*x1^2 + B*x2^2 + C*x3^2 + Constant) and a dataset to train the model.
The dataset includes a column for each Y, x1, x2, x3, and the Month.
I’d like the regression analysis to run 12 times, once for each Month, so that I end up with a table of 12 rows (one per month) holding the coefficients and constant to reference when calculating predictions.
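To make the goal concrete, here is a minimal sketch of the per-month table I am after. It is not tested against my real data: the helper name `fit_by_month`, the synthetic `demo` frame, and the coefficient values used to generate it are all made up for illustration. It also assumes the columns are named `y`, `x1`, `x2`, `x3`, and `Month`, and it lets x1 have its own fitted coefficient rather than fixing it to 1 as in the formula.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def fit_by_month(df):
    """Fit y ~ x1 + x1^2 + x2^2 + x3^2 separately for each Month value."""
    rows = []
    for month, group in df.groupby("Month"):
        # Build only the terms that appear in the formula (no cross terms).
        features = np.column_stack([
            group["x1"],
            group["x1"] ** 2,
            group["x2"] ** 2,
            group["x3"] ** 2,
        ])
        model = LinearRegression().fit(features, group["y"])
        rows.append({
            "Month": month,
            "x1": model.coef_[0],
            "A": model.coef_[1],
            "B": model.coef_[2],
            "C": model.coef_[3],
            "Constant": model.intercept_,
        })
    return pd.DataFrame(rows).set_index("Month")


# Synthetic, noise-free stand-in for dataset.csv (illustration only).
rng = np.random.default_rng(0)
n = 600
demo = pd.DataFrame({
    "Month": np.arange(n) % 12 + 1,  # 50 rows for each month 1..12
    "x1": rng.normal(size=n),
    "x2": rng.normal(size=n),
    "x3": rng.normal(size=n),
})
demo["y"] = (demo["x1"] + 2.0 * demo["x1"] ** 2
             + 0.5 * demo["x2"] ** 2 - 1.0 * demo["x3"] ** 2 + 3.0)

coef_table = fit_by_month(demo)
print(coef_table)
```

The result is one row per month with columns `x1`, `A`, `B`, `C`, and `Constant`, which is the shape of table I would like to look predictions up from.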
For an initial test (without monthly granularity), I ran the code below. When I reviewed model.coef_, I was surprised that the output was an array of 10 numbers, which seemed strange to me because I only defined three x variables.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
#Read dataset
dataset = pd.read_csv('dataset.csv')
#Replace negative y with zero
dataset.loc[dataset['y'] < 0, 'y'] = 0
#define x and y
X = dataset[['x1', 'x2', 'x3']].values
y = dataset['y'].values
# Polynomial Features
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)
# Linear Regression model
model = LinearRegression()
model.fit(X_poly, y)