Hi guys, I'm working my way through ISLP and I'm learning a lot.
I'm doing an exercise (chapter 3, exercise 9, part i) that asks whether there is a relationship between the predictors and the response, using `anova_lm`.
I don't see the point of doing this instead of just looking at the F-statistic, since the full model would already answer the question. Still, I'm wondering whether I could loop through the columns and, on each iteration, compare the null model with a model fitted on a single predictor.
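To make clear what I expect from each iteration: as far as I understand (please correct me if I'm wrong), comparing the intercept-only model with a one-predictor model comes down to an F statistic built from the two residual sums of squares. Here is a plain-Python sketch of that calculation on made-up numbers, not my real data; `f_stat_vs_null` is just a name I made up for illustration.

```python
def f_stat_vs_null(x, y):
    """F statistic for y ~ 1 (null) versus y ~ 1 + x (one predictor)."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Least-squares slope and intercept for the one-predictor model
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sums of squares: the null model always predicts the mean of y
    rss_null = sum((yi - y_bar) ** 2 for yi in y)
    rss_one = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    # One extra parameter; n - 2 residual degrees of freedom in the larger model
    return (rss_null - rss_one) / (rss_one / (n - 2))


# Made-up numbers, only for illustration
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(f_stat_vs_null(x, y))
```

What I'd like is one such comparison per column, but done through `anova_lm` rather than by hand.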
```
import numpy as np
import pandas as pd
import statsmodels.api as sm
from ISLP.models import ModelSpec as MS
from statsmodels.stats.anova import anova_lm

df = pd.read_csv('df.csv', na_values=['?'])  # na_values turns any '?' into NaN
y = df['response']
```
So I built the null model (intercept only) and fitted it.
```
Xinter = pd.DataFrame({'intercept': np.ones(397, dtype='float')})  # intercept column (df has 397 rows)
fit_inter = sm.OLS(y, Xinter)  # create the model
ris_inter = fit_inter.fit()  # fit the model
```
Now I have the remaining column names of my df in an object called `df_colonne_rimaste`.
`df_colonne_rimaste = df.columns.drop(['response'])`
Since `MS()` wants the names of the columns, I thought I could iterate over that object and it would just work.
```
for i in df_colonne_rimaste:
    print(i)  # the column I'm working with
    X = MS(i).fit_transform(df)  # construct the model matrix
    modelx = sm.OLS(y, X, missing='drop')  # specify the model
    resultx = modelx.fit()  # estimate parameters
    print('Comparison between the null model and the model with', i, 'gives:\n{}'.format(anova_lm(ris_inter, resultx)))
```
I get this message:
``IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices``
I tried a few other things that I can't remember anymore; I really tried hard.
Where is the problem?
Thanks a lot.