
Final Predictions accuracy of my ML Binary Classification Model is horrible

So I am competing in a Kaggle competition (https://www.kaggle.com/competitions/playground-series-s4e8) where we have to predict whether a mushroom is poisonous or not based on the data provided.
The issue I am facing is that my models perform well on the training and validation sets (around 98-99% accuracy), but they fall apart when I actually submit the final predictions for the competition.
The best accuracy I have gotten so far, using the Random Forest model, was 52%, and the rest of my submissions performed substantially worse. Since the models perform well inside the notebook on the labelled data,
I assume the issue is with the way I am handling the data in general: I did not implement techniques like feature engineering, and I am not sure whether the way I converted the categorical data to numeric data works correctly.
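A common cause of exactly this train/submission gap is encoding the categorical columns separately for train and test, so the same category ends up with different integers in each set. A minimal sketch of a safe way to do it, assuming toy stand-in frames for the competition data (the column names here are hypothetical):

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy frames standing in for the competition's train/test splits.
train = pd.DataFrame({"cap-color": ["red", "brown", "red"],
                      "odor": ["foul", "none", "foul"]})
test = pd.DataFrame({"cap-color": ["brown", "yellow"],   # "yellow" never seen in train
                     "odor": ["none", "foul"]})

# Fit the encoder on train ONLY, then apply the same mapping to test.
# A category that appears only in test is mapped to -1 instead of crashing
# or silently receiving a code that means something else in train.
enc = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
train_enc = enc.fit_transform(train)
test_enc = enc.transform(test)
```

If instead the encoder (or `pd.get_dummies`) is fit on train and test independently, the integer-to-category mapping can differ between the two, which would produce exactly the near-random submission accuracy described above.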
And as mentioned before, I am using the Random Forest and/or XGBoost models, which are well known to be much less prone to overfitting than other models.
I also ran multiple iterations of several models to find the ones with the best parameters (as evident from the code below), which makes overfitting even less likely.
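For reference, the kind of parameter search described here can be sketched with scikit-learn's `GridSearchCV`; the grid values and the synthetic data below are illustrative assumptions, not the post's actual setup:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Small synthetic binary-classification stand-in for the competition data.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hypothetical parameter grid; the real search would cover more values.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, None]}

# Cross-validated search: each candidate is scored on held-out folds,
# which guards against picking parameters that only fit the training data.
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=3, scoring="accuracy")
search.fit(X, y)
best = search.best_params_
```

Note that a search like this only protects against overfitting within the labelled data; it cannot catch a train/test preprocessing mismatch, which is why cross-validation scores can stay high while the submission score collapses.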