I’m attempting to combine several preprocessing transformations with a LightGBM model in a single scikit-learn pipeline. The model predicts the prices of second-hand vehicles, and once trained I plan to serve it behind an HTML page.
```python
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
import joblib

print(numeric_features)
# ['car_year', 'km', 'horse_power', 'cyl_capacity']
print(categorical_features)
# ['make', 'model', 'trimlevel', 'fueltype', 'transmission', 'bodytype', 'color']

# Define transformers for the numeric and categorical features
numeric_transformer = Pipeline(steps=[('scaler', StandardScaler())])
categorical_transformer = Pipeline(steps=[('labelencoder', LabelEncoder())])

# Combine the transformers with a ColumnTransformer
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)
    ]
)

# Append the LightGBM model to the preprocessing pipeline
pipeline = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('model', best_lgb_model)
])

# Fit the pipeline to the training data
pipeline.fit(X_train, y_train)
```
The error I get when fitting the pipeline is:

```
LabelEncoder.fit_transform() takes 2 positional arguments but 3 were given
```
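As far as I can tell from the scikit-learn docs, `LabelEncoder` is intended only for the *target* column: its signature is `fit_transform(self, y)`, whereas `ColumnTransformer` calls every step as `fit_transform(X, y)`, which is where the third positional argument comes from. Below is a minimal sketch of the fix I'm considering, swapping `LabelEncoder` for `OrdinalEncoder` (its feature-side counterpart, which accepts 2-D input); the `handle_unknown`/`unknown_value` settings and the dump file name are my own assumptions:

```python
from sklearn.preprocessing import StandardScaler, OrdinalEncoder
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
import joblib

# OrdinalEncoder encodes 2-D feature arrays column by column, so it can
# live inside a ColumnTransformer (LabelEncoder only accepts a 1-D target).
numeric_transformer = Pipeline(steps=[('scaler', StandardScaler())])
categorical_transformer = Pipeline(steps=[
    ('ordinalencoder', OrdinalEncoder(handle_unknown='use_encoded_value',
                                      unknown_value=-1))  # assumed guard for unseen categories
])

preprocessor = ColumnTransformer(transformers=[
    ('num', numeric_transformer, numeric_features),
    ('cat', categorical_transformer, categorical_features)
])

pipeline = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('model', best_lgb_model)
])

pipeline.fit(X_train, y_train)

# Persist the whole fitted pipeline so the HTML page can load one artifact.
joblib.dump(pipeline, 'car_price_pipeline.joblib')  # illustrative file name
```

Is replacing `LabelEncoder` with `OrdinalEncoder` (or `OneHotEncoder`) the right way to keep the categorical encoding inside the pipeline?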