I am building a logistic regression model with statsmodels (statsmodels.api) and would like to understand how to get predictions for the test dataset. This is what I have so far:
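import statsmodels.api as sm
from sklearn.model_selection import train_test_split

# Hold out 30% of the rows as a test set
x_train_data, x_test_data, y_train_data, y_test_data = train_test_split(X, df[target_var], test_size=0.3)

# Fit the logistic regression on the training split
logit = sm.Logit(
    y_train_data,
    x_train_data
)
result = logit.fit()
print(result.summary())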
What is the best way to print and evaluate the predictions for the training and test sets below? I am unsure which metrics to use or import in this case:
in_sample_pred = result.predict(x_train_data)
out_sample_pred = result.predict(x_test_data)
Also, what’s the best way to calculate the ROC AUC score and plot the ROC curve for this logistic regression model (using scikit-learn)?
Thanks