Thiết kế website giá rẻ

Question

I am trying to add exception handling to an ML project I am working on. I created a web app that accepts student performance data as a CSV file, trains several machine learning algorithms, and selects and saves the model with the best R2 score. It deletes any existing model, replaces it with the model trained on the new data, and then displays the R2 score to the user. The app works fine with correct data, so I tried to build a process that shows an error message to the user when the input data is incorrect. As a test case, I took the following portion of the CSV file and deleted one of the column entries in one of the records:

"gender","race_ethnicity","parental_level_of_education","lunch","test_preparation_course","math_score","reading_score","writing_score"
"female","group B","bachelor's degree","standard","none","72","72","74"
"female","group C","some college","standard","completed","69","90","88"
"female","group B","master's degree","standard","none","90","95","93"
"male","group A","associate's degree","free/reduced","none","47","57","44"

Here I changed the second record from "female","group C","some college","standard","completed","69","90","88" to "female","some college","standard","completed","69","90","88" to check how the app handles the error. As the log file below shows, the program was still able to build a model and log the R2 score, maybe because I use an imputer to fix missing values. The issue is that the webpage shows neither the R2 score nor any error. Instead, the site stops working and shows error code 400, while the logs show status code 200 and the terminal shows no error. I am sharing a screenshot of the Network tab of the browser's developer tools in case it helps in figuring out the issue.

Screenshot of crashed web page after submitting incorrect input file
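To make the malformed record described above concrete: it has seven fields where the header declares eight. A CSV reader such as pandas typically pads a missing trailing field with a missing value instead of raising an error, which would be consistent with the imputer still producing a model. Below is a minimal sketch of a pre-training check that flags such records using only the standard library. The helper name find_malformed_rows is hypothetical and is not part of the project code shown in this question.

```python
import csv
import io


def find_malformed_rows(csv_text):
    """Return (line_number, field_count) for every record whose number of
    fields differs from the header's. Line numbers are 1-based; the header
    is line 1."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    expected = len(header)
    problems = []
    for row in reader:
        if not row:  # skip completely blank lines
            continue
        if len(row) != expected:
            problems.append((reader.line_num, len(row)))
    return problems


# The header and three records from the question, with the edited second record.
sample = '''"gender","race_ethnicity","parental_level_of_education","lunch","test_preparation_course","math_score","reading_score","writing_score"
"female","group B","bachelor's degree","standard","none","72","72","74"
"female","some college","standard","completed","69","90","88"
"female","group B","master's degree","standard","none","90","95","93"
'''

print(find_malformed_rows(sample))  # prints [(3, 7)]
```

A non-empty result could be turned into an error message before any model is deleted or retrained.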

Log file output:

[2024-07-18 09:26:07,184] _internal.py:97 _log() werkzeug - INFO - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:8080
 * Running on http://172.16.5.4:8080
[2024-07-18 09:26:07,184] _internal.py:97 _log() werkzeug - INFO - Press CTRL+C to quit
[2024-07-18 09:26:27,148] _internal.py:97 _log() werkzeug - INFO - 127.0.0.1 - - [18/Jul/2024 09:26:27] "GET / HTTP/1.1" 200 -
[2024-07-18 09:26:35,995] _internal.py:97 _log() werkzeug - INFO - 127.0.0.1 - - [18/Jul/2024 09:26:35] "GET / HTTP/1.1" 200 -
[2024-07-18 09:26:42,111] _internal.py:97 _log() werkzeug - INFO - 127.0.0.1 - - [18/Jul/2024 09:26:42] "GET /input HTTP/1.1" 200 -
[2024-07-18 09:27:04,034] train_pipeline.py:35 delete_and_recreate_model() root - INFO - Saved new raw data as CSV file
[2024-07-18 09:27:04,034] data_ingestion.py:26 initiate_data_ingestion() root - INFO - Entered data ingestion method or component
[2024-07-18 09:27:04,041] data_ingestion.py:29 initiate_data_ingestion() root - INFO - Read dataset as df
[2024-07-18 09:27:04,047] data_ingestion.py:35 initiate_data_ingestion() root - INFO - Train test split initiating
[2024-07-18 09:27:04,068] data_ingestion.py:42 initiate_data_ingestion() root - INFO - ingestion of data completed
[2024-07-18 09:27:04,071] data_transformation.py:60 initiate_data_transformation() root - INFO - Read train and test data completed
[2024-07-18 09:27:04,074] data_transformation.py:76 initiate_data_transformation() root - INFO - numerical features are Index(['reading_score', 'writing_score'], dtype='object') and categorical features are Index(['gender', 'race_ethenicity', 'parental_level_of_education', 'lunch',
       'test_preparation_course'],
      dtype='object')
[2024-07-18 09:27:04,074] data_transformation.py:31 get_data_transformer() root - INFO - numerical columns scaling completed
[2024-07-18 09:27:04,074] data_transformation.py:40 get_data_transformer() root - INFO - categorical columns logging completed
[2024-07-18 09:27:04,074] data_transformation.py:81 initiate_data_transformation() root - INFO - applying preprocessing object on train and test df
[2024-07-18 09:27:04,124] model_trainer.py:32 initiate_model_trainer() root - INFO - Split training and test input data
[2024-07-18 09:27:23,565] _internal.py:97 _log() werkzeug - INFO - 127.0.0.1 - - [18/Jul/2024 09:27:23] "GET /input HTTP/1.1" 200 -
[2024-07-18 09:28:06,418] model_trainer.py:99 initiate_model_trainer() root - INFO - Best model found
[2024-07-18 09:28:06,420] train_pipeline.py:49 delete_and_recreate_model() root - INFO - New R2 score is: 0.8803008999935347
[2024-07-18 09:28:06,420] application.py:59 input_data() root - INFO - Processing completed. New R2 score: 0.8803008999935347
[2024-07-18 09:28:06,421] _internal.py:97 _log() werkzeug - INFO - 127.0.0.1 - - [18/Jul/2024 09:28:06] "POST /input HTTP/1.1" 200 -

My application.py code:

from flask import Flask,request,render_template
import numpy as np
import pandas as pd
from src.exception import CustomException
import sys
from src.logger import logging
from sklearn.preprocessing import StandardScaler
from src.pipeline.predict_pipeline import CustomData,PredictPipeline
from src.pipeline.train_pipeline import RetrainWithNewData

application=Flask(__name__)
app=application

@app.route('/input', methods=['GET', 'POST'])
def input_data():
    if request.method == 'GET':
        return render_template('home2.html')
    else:
        try:
            file = request.files['file']
            
            # Check if file is present
            if file.filename == '':
                return render_template('home2.html', error="No file selected")
            
            # Check file extension (assuming you want CSV files)
            if not file.filename.lower().endswith('.csv'):
                return render_template('home2.html', error="Invalid file type. Please upload a CSV file.")
            
            retrainPipeline = RetrainWithNewData(file)
            new_r2_score = retrainPipeline.delete_and_recreate_model()

            logging.info(f"Processing completed. New R2 score: {new_r2_score}")

            return render_template('home2.html', new_r2_score=new_r2_score)

        except Exception as e:
            # For unexpected exceptions, you might want to log them and show a generic message
            logging.error(f"Unexpected error: {str(e)}")
            return render_template('home2.html', error=str(e))
if __name__=="__main__":
    app.run(host="0.0.0.0",port=8080)

My home2.html code:

<html>
<body>
    <form action="{{ url_for('input_data')}}" method="post" enctype="multipart/form-data">
        <h2>Upload Data</h2>
        <input type="file" id="file" name="file" required>
        <br><br>
        <input type="submit" name="upload_submit" value="Upload data in CSV format">
    </form>

    {% if error %}
    <h2 style="color: red;">Error: {{ error }}</h2>
    {% endif %}

    {% if new_r2_score %}
    <h2>The new R2 score is {{ new_r2_score }}</h2>
    {% endif %}
</body>
</html>

My train_pipeline.py code:

import sys
import os
import shutil
from src.exception import CustomException
from src.logger import logging
from src.components.data_ingestion import DataIngestion, DataIngestionConfig
from src.components.data_transformation import DataTransformation, DataTransformationConfig
from src.components.model_trainer import ModelTrainer, ModelTrainerConfig

class RetrainWithNewData():
    def __init__(self, file):
        self.file = file
    
    def delete_and_recreate_model(self):
        try:
            new_data = self.file
            
            new_raw_data_path = os.path.join(os.getcwd(), "notebook/data")
            artifacts_path = os.path.join(os.getcwd(), "artifacts")
            
            # Remove existing directories if they exist
            if os.path.exists(new_raw_data_path):
                shutil.rmtree(new_raw_data_path)
            
            if os.path.exists(artifacts_path):
                shutil.rmtree(artifacts_path)
            
            # Create necessary directories
            os.makedirs(new_raw_data_path, exist_ok=True)
            os.makedirs(artifacts_path, exist_ok=True)
            
            # Save the new data to a specific file path
            new_raw_data_file_path = os.path.join(new_raw_data_path, "stud.csv")
            new_data.save(new_raw_data_file_path)
            logging.info("Saved new raw data as CSV file")
            
            # Start the data ingestion process
            obj = DataIngestion()
            train_data, test_data = obj.initiate_data_ingestion()
            
            # Transform the data
            data_transformation = DataTransformation()
            train_arr, test_arr, _ = data_transformation.initiate_data_transformation(train_data, test_data)
            
            # Train the model
            modelTrainer = ModelTrainer()
            new_r2_score = float(modelTrainer.initiate_model_trainer(train_arr, test_arr))
            
            logging.info(f"New R2 score is: {new_r2_score}")

            return new_r2_score
        
        except Exception as e:
            raise CustomException(e, sys)

I would be really grateful for your help.

I expected either the R2 score to be displayed or an error message telling the user that the input file is incorrect.
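One way to get the second outcome (an error message rather than a silently trained model) would be to validate the upload before retraining and hand any message to the template's error variable. This is a minimal sketch, assuming the column names from the sample CSV above and assuming scores are whole numbers as in that sample. The names validate_upload, EXPECTED_COLUMNS and SCORE_COLUMNS are hypothetical and not part of the code shown.

```python
import csv
import io

# Assumed schema, copied from the header of the sample CSV in the question.
EXPECTED_COLUMNS = [
    "gender", "race_ethnicity", "parental_level_of_education", "lunch",
    "test_preparation_course", "math_score", "reading_score", "writing_score",
]
SCORE_COLUMNS = ["math_score", "reading_score", "writing_score"]


def validate_upload(csv_text):
    """Return None if the CSV looks valid, otherwise a message for the user."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        return "Unexpected header; expected columns: " + ", ".join(EXPECTED_COLUMNS)
    for row in reader:
        # DictReader stores None for fields that are missing at the end of a short row.
        for column in SCORE_COLUMNS:
            value = row[column]
            if value is None or not value.strip().isdigit():
                return f"Line {reader.line_num}: {column} is missing or not a whole number"
    return None
```

In input_data() the returned message could be passed as error=... to render_template before RetrainWithNewData is called. Whether that also explains the 400 shown in the browser is not something this sketch establishes.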

Stuck in handling incorrect input data on web app for model training