Breast Cancer Predictor: Project Overview

Built a model that accepts the cell nucleus features of a breast tumor as input and predicts whether the tumor is benign or malignant.
The model is trained on the UCI Wisconsin Breast Cancer dataset from Kaggle, which contains 569 samples with features computed from images of breast masses.
Five different models were trained on the data. K-fold cross-validation was performed to check for overfitting, and a final trained Support Vector Machine (SVM) model was used to build the predictor.

Code and Resources used

For Web Framework Requirements: pip install -r requirements.txt

Data Cleaning

The following changes were made to the data to make it usable for a model:

The column containing null values was removed.
The malignant and benign tumor samples were counted.
The categorical variables were encoded as numerical values so they could be used in the ML model.
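
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the tiny DataFrame and its column names (`diagnosis`, `radius_mean`, `texture_mean`, `Unnamed: 32`) are assumptions standing in for the real CSV, and `LabelEncoder` is one possible way to do the encoding.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Tiny stand-in for the real CSV. Column names and values are assumptions
# made for illustration only.
df = pd.DataFrame({
    "diagnosis": ["M", "B", "B", "M", "B"],
    "radius_mean": [17.99, 13.54, 13.08, 20.57, 9.50],
    "texture_mean": [10.38, 14.36, 15.71, 17.77, 12.44],
    "Unnamed: 32": [np.nan] * 5,  # a column that holds only null values
})

# Drop every column in which all values are null.
df = df.dropna(axis=1, how="all")

# Count malignant vs benign samples before encoding the labels.
class_counts = df["diagnosis"].value_counts()
print(class_counts)

# Encode the categorical label as integers. LabelEncoder sorts the classes,
# so "B" becomes 0 and "M" becomes 1.
encoder = LabelEncoder()
df["diagnosis"] = encoder.fit_transform(df["diagnosis"])
print(df.head())
```

Keeping the fitted `encoder` around makes it possible to map numeric predictions back to the original labels with `encoder.inverse_transform`.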

EDA

Various analyses were performed on the dataset and the models. Below are a few highlights.

Model Building

The StandardScaler method was used to remove the mean and scale each feature to unit variance. The data was split into train and test sets with a test size of 20%.
Five different models were tried and evaluated based on their metrics:
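
A minimal sketch of the scaling and splitting step is shown below. It uses scikit-learn's bundled copy of the Wisconsin dataset (`load_breast_cancer`) as a stand-in for the Kaggle CSV. The `random_state`, the stratified split, and fitting the scaler on the training split only are assumptions and may differ from what the notebook does.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Stand-in data: scikit-learn's bundled copy of the Wisconsin dataset.
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% of the samples for testing. random_state and stratify are
# assumptions chosen for reproducibility and balanced classes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit the scaler on the training split only, then apply the same
# transformation to the test split so no test statistics leak into training.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```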

Logistic Regression
K-Nearest Neighbor
Decision Tree Classifier
Random Forest Method
Support Vector Machines
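
The comparison of these five models, together with the K-fold cross-validation mentioned in the overview, could look roughly like the sketch below. The hyperparameters, the choice of 5 folds, and the use of scikit-learn's bundled dataset are all assumptions, so the scores it prints will not necessarily match the figures reported in this README.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: scikit-learn's bundled copy of the Wisconsin dataset.
X, y = load_breast_cancer(return_X_y=True)

# The five model families listed above, with assumed default-like settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbor": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "SVM": SVC(),
}

cv_scores = {}
for name, model in models.items():
    # Scaling lives inside the pipeline so each fold is scaled using only
    # its own training portion.
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
    cv_scores[name] = scores.mean()
    print(f"{name}: mean CV accuracy = {scores.mean():.4f}")
```

A large gap between a model's training accuracy and its cross-validated accuracy is the usual sign of overfitting that this kind of check is meant to reveal.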

Model performance

The SVM model outperformed the other approaches on the test and validation sets.

Random Forest: Accuracy = 94.73%
Decision Tree: Accuracy = 93.85%
Logistic Regression: Accuracy = 96.49%
K-NN: Accuracy = 95.61%
SVM: Accuracy = 98.24%

Deployment

The final trained model is an SVM that accepts the nucleus features from the user and predicts whether the tumor is malignant or benign. The final model can be downloaded from svm_model.pkl.
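
As a hedged illustration of how a pickled model like this can be saved, reloaded, and used for prediction, the sketch below trains a small stand-in SVM and round-trips it through a temporary file. It does not load the repository's svm_model.pkl: that file's serialization format and whether it includes the scaler are not stated here, so the pipeline, the `demo_svm_model.pkl` file name, and the label lookup are all assumptions.

```python
import os
import pickle
import tempfile

from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Train a stand-in model on scikit-learn's bundled copy of the dataset.
data = load_breast_cancer()
model = make_pipeline(StandardScaler(), SVC())
model.fit(data.data, data.target)

# Save the model to a temporary file (hypothetical file name).
model_path = os.path.join(tempfile.mkdtemp(), "demo_svm_model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Reload it, as a prediction app would do at startup.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)

# Predict for one sample of 30 nucleus features. In the bundled dataset,
# target_names maps class 0 to "malignant" and class 1 to "benign".
sample = data.data[:1]
predicted_index = loaded_model.predict(sample)[0]
predicted_label = data.target_names[predicted_index]
print(f"Predicted diagnosis: {predicted_label}")
```

Bundling the scaler and the classifier in one pipeline means user input can be passed in unscaled. If the saved model contains only the classifier, the same scaling used during training has to be applied before calling `predict`.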

sughoshdeshpande7/Breast_Cancer_Prediction
