The detailed analysis of credit card fraudulent data detection system.
Dataset: Anonymized credit card transactions labeled as fraudulent or genuine Download
Algorithm Used:
- Simple Logistic regression
- Logistic Regression with Undersampling
- Synthetic Minority Over Sampling Technique (SMOTE) with gini as the critertionn of best split.
- Synthetic Minority Over Sampling Technique (SMOTE) with entropy as the critertionn of best split.
Brief Summary of performance.
No. | Model | Precision | Recall | Accuracy | Conclusion |
---|---|---|---|---|---|
1 | Logistic Regression | 0.73270 | 0.62398 | 0.99896 | Rejected due to low recall. |
2 | Undersampling & Logistic Regression | 0.072 | 0.87 | 0.98 | Rejected due to very low precision. |
3 | SMOTE with Gini criteria for best split | 0.95528 | 0.95528 | 0.99985 | Best model so far. But next model provides better results. Hence, rejected. |
4 | SMOTE with entropy criteria for best split | 0.96177 | 0.97154 | 0.99988 | Accepted. |