I have a training set of 39 compounds. Here is a short piece of code that is meant to calculate the LOO q2 for SVR:
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import make_scorer, r2_score
from sklearn.svm import SVR
clf_svr = SVR(C=1.0, gamma='scale', epsilon=0.1)
parameters = {'kernel':['linear'],
'gamma':['scale', 'auto'],
'C':[300000],
"epsilon":[0.001]}
grid_search_cv_clf = GridSearchCV(clf_svr, parameters, cv=len(X_train), n_jobs=-1)
grid_search_cv_clf.fit(X_train, y_train)
y_train_pred = grid_search_cv_clf.predict(X_train)
train_r2_LOO = r2_score(y_train, y_train_pred)
print("Train q^2 LOO:", train_r2_LOO)
But it gives a warning:
C:\Users\plato\anaconda3\lib\site-packages\sklearn\model_selection\_search.py:952: UserWarning: One or more of the test scores are non-finite: [nan nan]
warnings.warn(
After the warning, it prints Train q^2 LOO: 0.8450423173708722
But the regular R^2 with cv=5 prints exactly the same value, R^2: 0.8450423173708722
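A possible explanation for both symptoms, offered as a sketch rather than a confirmed diagnosis: with leave-one-out, every test fold contains a single sample, and scikit-learn's `r2_score` (the default scorer for regressors) is not defined for fewer than two samples, so each fold scores `nan`. The identical printed values would also be expected, because `grid_search_cv_clf.predict(X_train)` uses the estimator refit on the whole training set, so the number is a training-set R^2 that does not depend on `cv`. The target values below are made up for illustration.

```python
import warnings

from sklearn.metrics import r2_score

# With leave-one-out, every test fold holds exactly one sample.
# The values 3.2 and 3.0 are arbitrary illustrative numbers.
with warnings.catch_warnings():
    # r2_score warns that R^2 is not well-defined with fewer than two samples
    warnings.simplefilter("ignore")
    single_fold_r2 = r2_score([3.2], [3.0])

print(single_fold_r2)  # nan
```

If a grid search with leave-one-out is still wanted, passing a per-sample error metric such as `scoring="neg_mean_squared_error"` to `GridSearchCV` should avoid the non-finite scores, since that metric is defined for a single sample.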
Please help me calculate the q2 value for leave-one-out cross-validation using scikit-learn. I need to calculate q2 for both SVR and RFR.
I've tried replacing cv=len(X_train) with cv=LeaveOneOut, but it still doesn't work. Then I found another piece of code:
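import numpy as np
from sklearn.metrics import r2_score
from sklearn.svm import SVR

# X (2-D) and y (1-D) are assumed to be NumPy arrays holding the full data set
# Initialize lists to store predicted and true values
predicted_values = []
true_values = []
# Perform leave-one-out cross-validation
for i in range(len(X)):
    # Hold out sample i and train on the remaining samples
    X_train = np.delete(X, i, axis=0)
    y_train = np.delete(y, i)
    X_test = X[i].reshape(1, -1)
    y_test = y[i]
    # Create and fit SVR model
    svr = SVR()
    svr.fit(X_train, y_train)
    # Make prediction for the held-out sample
    y_pred = svr.predict(X_test)
    # Store predicted and true values
    predicted_values.append(y_pred[0])
    true_values.append(y_test)
# Convert to arrays so the element-wise arithmetic below works
true_values = np.array(true_values)
predicted_values = np.array(predicted_values)
# Calculate R2 on the pooled held-out predictions
r2 = r2_score(true_values, predicted_values)
print("R2 score:", r2)
# Calculate Q2 by hand (same formula as r2_score, so both values match)
mean_y = np.mean(true_values)
ss_tot = np.sum((true_values - mean_y) ** 2)
ss_res = np.sum((true_values - predicted_values) ** 2)
q2 = 1 - (ss_res / ss_tot)
print("Q2 score:", q2)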
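The manual loop above can probably be expressed more compactly with `cross_val_predict` and `LeaveOneOut`, which collect one held-out prediction per sample; q2 is then a single `r2_score` call on the pooled predictions. The following is a minimal sketch, not a validated workflow: the data come from `make_regression` as a stand-in for the 39 compounds, and the hyperparameters (`C=10.0`, `n_estimators=50`) are placeholders chosen only so the example runs quickly.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.svm import SVR

# Synthetic stand-in for the 39-compound training set (assumption)
X, y = make_regression(n_samples=39, n_features=5, noise=10.0, random_state=0)

# Placeholder hyperparameters, not tuned values
models = {
    "SVR": SVR(kernel="linear", C=10.0, epsilon=0.001),
    "RFR": RandomForestRegressor(n_estimators=50, random_state=0),
}

q2_scores = {}
for name, model in models.items():
    # Each sample is predicted by a model trained on the other 38 samples
    y_loo = cross_val_predict(model, X, y, cv=LeaveOneOut())
    # q2 is R^2 computed once over all pooled held-out predictions
    q2_scores[name] = r2_score(y, y_loo)
    print(name, "q2 LOO:", q2_scores[name])
```

Pooling the predictions before scoring is the design choice that matters here: R^2 is computed over all 39 held-out predictions at once, so no single-sample fold ever has to be scored on its own.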