This script predicts the risk of heart disease from the user's input data.
It uses an Extra Trees classifier, with pd.get_dummies encoding for the categorical features and MinMaxScaler for the numeric features.
import pandas as pd
pd.set_option('display.max_columns', None)
import numpy as np
import joblib
from sklearn.preprocessing import MinMaxScaler
import get_patient_data as gpd
try:
    # Load normalization statistics (mean and standard deviation) for MinMaxScaler
    filepath = r"heart-disease-prediction_systemmodel_filesnormalization_stats.pkl"
    mean, std = joblib.load(filepath)  # Unpack the tuple
    print(f"\nLoading normalization stats from:\n{filepath}")
    print("Normalization stats loaded successfully")
except FileNotFoundError:
    raise ValueError("Model file or normalization statistics file not found. Please check the file paths.")
# Define the columns to reindex the encoded features
column_to_reindex = ["age", "resting_blood_pressure", "cholesterol", "fasting_blood_sugar", "max_heart_rate",
                     "excersise_induced_angina", "st_depression", "sex_male", "chest_pain_type_atypical angina",
                     "chest_pain_type_non-anginal pain", "chest_pain_type_typical angina", "resting_ecg_left ventricular hypertrophy",
                     "resting_ecg_normal", "st_slope_flat", "st_slope_upslopping"]
user_data = gpd.get_user_data()
print("shape of user data:", user_data.shape)
categorical_features = ["sex", "chest_pain_type", "resting_ecg", "st_slope"]
numeric_features = ["age", "resting_blood_pressure", "cholesterol", "max_heart_rate", "st_depression"]
new_user_data = pd.get_dummies(user_data, columns=categorical_features).reindex(columns=column_to_reindex, fill_value=0)
print("\nEncoded Features:\n", new_user_data)
print("shape of Encoded Features:", new_user_data.shape)
# Scale numeric features using MinMaxScaler
numeric_data = new_user_data[numeric_features]
scaler = MinMaxScaler()
scaler.min_, scaler.scale_ = mean, std
scaler.fit(numeric_data)
# Transform the features directly (no need to fit)
transformed_features = scaler.transform(numeric_data)
# Assign the transformed features back to the corresponding columns
new_user_data[numeric_features] = transformed_features
print("\nFinal Processed User Data:\n", new_user_data)
Running the script raises this error:
AttributeError: 'MinMaxScaler' object has no attribute 'n_samples_seen_'
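For context on what the scaling step is trying to do: the error is consistent with calling fit() on a scaler whose min_ and scale_ were assigned by hand, since fit() starts by clearing the previous fitted state and not every fitted attribute exists on such an object. Below is a minimal sketch of applying stored min-max statistics without any scaler object, assuming the saved pair really is the per-feature offset and scale of a min-max scaling with the default (0, 1) range, and not a mean and standard deviation. The helper names fit_min_max and apply_min_max and the sample numbers are hypothetical, for illustration only.

```python
import numpy as np


def fit_min_max(train):
    """Compute the per-feature offset and scale for min-max scaling to [0, 1].

    Hypothetical helper: this is the pair that would be saved at training time.
    """
    data_min = train.min(axis=0)
    data_range = train.max(axis=0) - data_min
    data_range[data_range == 0] = 1.0  # avoid division by zero for constant features
    scale = 1.0 / data_range
    offset = -data_min * scale
    return offset, scale


def apply_min_max(x, offset, scale):
    """Apply previously stored statistics; nothing is refitted here."""
    return x * scale + offset


# Made-up training values for two numeric features (for example age and max heart rate).
train = np.array([[29.0, 94.0], [77.0, 200.0], [53.0, 147.0]])
offset, scale = fit_min_max(train)

# A single new row is scaled with the training statistics, not its own min and max.
new_row = np.array([[53.0, 147.0]])
scaled = apply_min_max(new_row, offset, scale)
print(scaled)
```

The point of the sketch is that a single user row cannot be refitted meaningfully: its minimum and maximum are the same value, so the statistics have to come from training. An alternative, if the training code is available, would be to save the fitted scaler object itself with joblib.dump and call only transform() after loading it.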
I first trained the model in a conda environment, since that is the most common and easiest route for a beginner. I then copied the saved files (extra tree model.pkl) into the prediction system folder, which I work on in VS Code. There the script reads the data from Firebase and uploads the final prediction result back to Firebase.
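On the encoding step in the script (pd.get_dummies followed by reindex): a single user row only produces dummy columns for the categories that row actually contains, so the reindex with fill_value=0 is what restores the full training layout. Here is a small sketch of that behaviour, using a shortened, made-up column list and patient record rather than the real ones.

```python
import pandas as pd

# Shortened, hypothetical version of the training column layout.
train_columns = ["age", "sex_male", "st_slope_flat", "st_slope_upslopping"]

# One made-up patient record, as it might arrive before encoding.
user = pd.DataFrame([{"age": 54, "sex": "male", "st_slope": "flat"}])

# get_dummies only creates columns for the categories present in this row.
encoded = pd.get_dummies(user, columns=["sex", "st_slope"])

# reindex adds the missing training columns (filled with 0), drops any
# column the model never saw, and fixes the column order.
aligned = encoded.reindex(columns=train_columns, fill_value=0)
print(aligned)
```

Here st_slope_upslopping is absent after get_dummies and comes back as 0 after reindex. A category whose dummy column is not in the training list is dropped silently, which matches how the model was presumably trained if the first level of each category was removed.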