
Binary classification (machine failure prediction) and recall optimization for unbalanced datasets. Comparing XGB, LR, L1, JT, and more to check performance.


santtiospina/machineFailurePrediction


machineFailurePrediction

Binary Classification (Machine Failure Prediction) and Recall Optimization in unbalanced datasets.

Comparing scenarios and algorithms to check performance.

Techniques used: XGBoost, L1 regularization, Johnson transformation, QQ plots, statistical distributions.
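QQ plots can confirm whether a transformed feature is approximately normal. A minimal sketch using `scipy.stats.probplot` on an invented skewed feature, with `scipy.stats.yeojohnson` (a close relative of the Johnson transformation family) standing in for the transformation used here:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic right-skewed "sensor" feature standing in for a real column.
skewed = rng.exponential(scale=2.0, size=500)

# probplot returns ordered values against normal quantiles plus a
# least-squares fit; r close to 1 indicates approximate normality.
(_, _), (_, _, r_raw) = stats.probplot(skewed, dist="norm")

# A Yeo-Johnson transform often normalizes skewed features.
transformed, _ = stats.yeojohnson(skewed)
(_, _), (_, _, r_transformed) = stats.probplot(transformed, dist="norm")

print(f"QQ fit r before: {r_raw:.3f}, after: {r_transformed:.3f}")
```

Passing `plot=matplotlib.pyplot` to `probplot` draws the actual QQ plot; here only the fit statistic is used.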

Comparing RECALL results of various models:


Recall improved from 47.26% to 95.21% using the Synthetic Minority Oversampling Technique (SMOTE) and Extreme Gradient Boosting (XGBoost).

The low 47.26% recall came from a baseline that applied Johnson transformations to the features and L1 regularization with alpha=0.07, so a substantial recall improvement was needed.
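A hedged sketch of what such a baseline might look like, on synthetic data: scikit-learn's `PowerTransformer` (Yeo-Johnson, a relative of the Johnson transformation) feeding an L1-penalized logistic regression. scikit-learn parameterizes regularization strength as `C`, the inverse of alpha, so `C = 1/0.07` is assumed to mirror alpha=0.07:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PowerTransformer

# Synthetic stand-in for the machine-failure data: heavy 98:2 imbalance.
X, y = make_classification(n_samples=4000, n_features=8,
                           weights=[0.98, 0.02], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42)

# Yeo-Johnson transform + L1-penalized logistic regression;
# liblinear is one of the solvers that supports the L1 penalty.
baseline = make_pipeline(
    PowerTransformer(method="yeo-johnson"),
    LogisticRegression(penalty="l1", solver="liblinear", C=1 / 0.07),
)
baseline.fit(X_train, y_train)

recall = recall_score(y_test, baseline.predict(X_test))
print(f"Baseline recall on the minority class: {recall:.2%}")
```

On data this imbalanced, an unweighted linear baseline typically misses many minority-class failures, which is exactly the problem SMOTE addresses below.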

The Synthetic Minority Oversampling Technique (SMOTE) was necessary because the dataset was heavily imbalanced (98:2):
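The core idea of SMOTE is to synthesize new minority samples by interpolating between a minority point and one of its k nearest minority neighbors. A minimal NumPy sketch of that interpolation (in practice, `imblearn.over_sampling.SMOTE` from imbalanced-learn implements this with additional refinements):

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples by interpolating
    between a random minority point and one of its k nearest
    minority neighbors."""
    rng = np.random.default_rng(rng)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # Distances from X_min[i] to every minority point.
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(d)[1:k + 1]  # skip the point itself
        j = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

# 98:2 toy imbalance: 980 majority vs 20 minority samples, so 960
# synthetic minority points would balance the classes.
rng = np.random.default_rng(0)
X_min = rng.normal(loc=3.0, size=(20, 4))
X_new = smote_sketch(X_min, n_new=960, rng=0)
print(X_new.shape)
```

Because each synthetic point lies on a segment between two real minority points, the new samples stay inside the minority class's region of feature space rather than being exact duplicates.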

[Figure: class distribution showing the 98:2 imbalance]

With feature engineering, multicollinearity removal, and categorical variable encoding, the SMOTE + XGBoost model improved recall on the imbalanced dataset, obtaining:

Accuracy: 0.9985

Precision: 0.9542

Recall: 0.9521

F1 Score: 0.9531
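An end-to-end sketch of the oversample-then-boost recipe on synthetic data. To keep it dependency-free, scikit-learn's `GradientBoostingClassifier` stands in for XGBoost and plain random duplication stands in for SMOTE; the printed numbers will not match the repository's results:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=10,
                           weights=[0.98, 0.02], class_sep=2.0,
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

# Oversample the minority class on the TRAINING split only: oversampling
# before the split would leak information into the test set and inflate
# the metrics. Random duplication here stands in for SMOTE.
rng = np.random.default_rng(1)
minority = np.flatnonzero(y_tr == 1)
extra = rng.choice(minority, size=(y_tr == 0).sum() - minority.size)
X_bal = np.vstack([X_tr, X_tr[extra]])
y_bal = np.concatenate([y_tr, y_tr[extra]])

# Gradient boosting on the balanced training set.
model = GradientBoostingClassifier(random_state=1).fit(X_bal, y_bal)
pred = model.predict(X_te)
for name, fn in [("Accuracy", accuracy_score), ("Precision", precision_score),
                 ("Recall", recall_score), ("F1", f1_score)]:
    print(f"{name}: {fn(y_te, pred):.4f}")
```

Swapping in `xgboost.XGBClassifier` and `imblearn.over_sampling.SMOTE` recovers the actual combination used in this repository.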


Formulas taken from: https://www.tutorialexample.com/an-introduction-to-accuracy-precision-recall-f1-score-in-machine-learning-machine-learning-tutorial/
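The formulas are easy to check by hand on a small confusion matrix. The counts below are illustrative only, not the repository's actual confusion matrix:

```python
# Hypothetical confusion-matrix counts for the failure (positive) class.
tp, fp, fn, tn = 90, 10, 5, 895

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of predicted failures, how many were real
recall = tp / (tp + fn)      # of real failures, how many were caught
f1 = 2 * precision * recall / (precision + recall)

print(f"Accuracy {accuracy:.4f}, Precision {precision:.4f}, "
      f"Recall {recall:.4f}, F1 {f1:.4f}")
```

Note that on a 98:2 dataset, accuracy alone is misleading (always predicting "no failure" already scores 98%), which is why recall was the target metric here.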
