I am trying to train an XGBRegressor, but the code is taking way too long to execute. Is there a mistake in what I'm doing? The code is as follows:
%%time
!pip install xgboost
import dask.dataframe as dd
import xgboost as xgb
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# df_train and targets are defined earlier (not shown)
test = dd.read_csv('/kaggle/input/leap-atmospheric-physics-ai-climsim/test.csv')
sample_id = test['sample_id']
test = test.drop('sample_id', axis = 1)
X = df_train.drop(targets + ['sample_id'], axis = 1)
prediction = {}
submission = {}
submission['sample_id'] = sample_id
for i, target in enumerate(targets):
    y = df_train[target]
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state = 42, test_size = 0.33)
    dtr = xgb.XGBRegressor(verbose = False)
    dtr.fit(X_train, y_train)
    y_hat = dtr.predict(X_test)
    submission[target] = dtr.predict(test)
    prediction[target] = y_hat
    # r2_score expects (y_true, y_pred)
    print(f'r2_score for {target} : {r2_score(y_test, y_hat)}')
The training data is huge, but even so, the logging should have worked, and I don't see anything in the output.
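For context on the logging: the only print call in the loop runs after a target's fit and both predict calls have finished, so nothing appears until the first target is completely done. Below is a minimal sketch of logging at the start and end of each step instead. It is only an illustration: `log_step` and `slow_fit` are hypothetical helpers (not part of XGBoost or the code above), `slow_fit` sleeps instead of training, and the target names are placeholders.

```python
import time
from contextlib import contextmanager


def emit(message):
    """Print immediately; flush=True avoids buffered output in notebooks."""
    print(message, flush=True)


@contextmanager
def log_step(label, log=emit):
    """Log one line when a step starts and one with elapsed time when it ends."""
    log(f"[start] {label}")
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        log(f"[done]  {label} ({elapsed:.2f}s)")


def slow_fit(target):
    """Hypothetical stand-in for dtr.fit(X_train, y_train); sleeps instead of training."""
    time.sleep(0.01)
    return f"model for {target}"


targets = ["target_a", "target_b"]  # placeholder names, not the real target columns
models = {}
for target in targets:
    # The "[start]" line is emitted before the slow call, so a long fit is visible.
    with log_step(f"fit {target}"):
        models[target] = slow_fit(target)
```

With this pattern, a "[start]" line with no matching "[done]" line shows which step is still running.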