vbhsharma7/Breast_Cancer_Detection_Model_deployment

Breast Cancer Detection

Problem Statement

The aim of this project is to build a machine learning model that accurately classifies breast masses as either benign or malignant. The model uses features computed from the cell nuclei in fine needle aspirate (FNA) samples of breast masses, such as radius, texture, perimeter, area, smoothness, compactness, concavity, concave points, symmetry, and fractal dimension. The goal is a model that can assist doctors in diagnosing breast cancer and choosing appropriate treatment for patients. The model's performance is evaluated on accuracy, precision, recall, and F1-score.
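The four evaluation metrics named above can be computed directly from the counts of a binary confusion matrix. The following is a minimal sketch, treating malignant as the positive class; the function name `classification_metrics` and the example counts are hypothetical and are not results from this project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with "malignant" taken as the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of all cases predicted malignant, how many really are.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all truly malignant cases, how many were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts, for illustration only.
metrics = classification_metrics(tp=40, fp=2, fn=3, tn=69)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a diagnostic setting, recall on the malignant class usually deserves particular attention, because a false negative means a missed cancer.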

Objectives:

  • Conduct exploratory data analysis.
  • Understand which features are used to detect breast cancer.
  • Create different models and compare their performance.
  • Take input features from a user and predict whether breast cancer is detected.

Methods used:

  • Data Visualization.
  • Machine Learning.

Libraries utilized:

  • NumPy and Pandas - For dataset cleaning and analysis.
  • Matplotlib, Plotly and Seaborn - For Data Visualization.
  • scikit-learn - For scaling and performance measurement.

Dataset used:

The Breast Cancer Wisconsin (Diagnostic) dataset is a publicly available dataset of 569 instances, where each instance is derived from a digitized image of a fine needle aspirate (FNA) of a breast mass. The dataset has 30 numeric, predictive attributes and the class. The predictive attributes describe ten characteristics of the cell nuclei present in the images: radius, texture, perimeter, area, smoothness, compactness, concavity, concave points, symmetry, and fractal dimension. For each characteristic, the mean, standard error, and worst value are recorded, which gives the 30 attributes. The class attribute is the diagnosis of the breast mass and can take two values, WDBC-Malignant or WDBC-Benign.

Of the 569 instances, 212 are labeled as malignant and 357 as benign. The dataset contains no missing values, and the feature values are normalized to a range between 0 and 1 during preprocessing.

The dataset was obtained from the UCI Machine Learning Repository and was donated by Dr. William H. Wolberg, W. Nick Street, and Olvi L. Mangasarian. It is widely used in machine learning, particularly for classification tasks.
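The figures in this section can be checked against the copy of the dataset that ships with scikit-learn. This is a sketch under the assumption that the bundled copy (`load_breast_cancer`) is an acceptable stand-in for the file used in this repository; the scaling step uses `MinMaxScaler` as one way to map each feature into the 0 to 1 range.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import MinMaxScaler

# Load scikit-learn's bundled copy of the Wisconsin Diagnostic dataset.
data = load_breast_cancer()
X, y = data.data, data.target

print(X.shape)  # (569, 30): 569 instances, 30 numeric attributes

# In scikit-learn's encoding, 0 is malignant and 1 is benign.
counts = {str(name): int(n)
          for name, n in zip(data.target_names, np.bincount(y))}
print(counts)  # {'malignant': 212, 'benign': 357}

# Confirm that there are no missing values.
n_missing = int(np.isnan(X).sum())
print("missing values:", n_missing)

# Rescale every feature column to the range [0, 1].
X_scaled = MinMaxScaler().fit_transform(X)
print(X_scaled.min(), X_scaled.max())
```

When a model is evaluated on held-out data, the scaler should be fitted on the training split only, so that no information from the test split leaks into preprocessing.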

Project Overview:

Breast cancer detection is a critical task that can save lives. In this project, we use machine learning algorithms to classify whether breast cancer is detected, based on measured features of a breast mass.

To start, we load the data on breast masses, whose attributes describe measurements such as the size, texture, and shape of the cell nuclei that can affect the diagnosis. This data is used to train and test our machine learning models.

After preprocessing the data, we will use data visualization techniques such as scatter plots and heatmaps to explore the relationship between the features and the cancer diagnosis. This will give us a better understanding of the data and help us identify any trends or patterns.
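One way to start this exploration is to compute the correlation matrix that a heatmap would display and to rank the features by their correlation with the diagnosis. The sketch below assumes scikit-learn's bundled copy of the dataset loaded as a pandas DataFrame; the variable names are illustrative.

```python
from sklearn.datasets import load_breast_cancer

# DataFrame with the 30 feature columns plus a "target" column
# (0 = malignant, 1 = benign in scikit-learn's encoding).
frame = load_breast_cancer(as_frame=True).frame

# Pairwise Pearson correlations between all columns.
corr = frame.corr()

# Correlation of each feature with the diagnosis, most negative first.
# Because benign is encoded as 1, strongly negative values mark features
# that tend to be larger for malignant masses.
target_corr = corr["target"].drop("target").sort_values()
print(target_corr.head(5))
```

The `corr` matrix can be passed to a plotting function such as Seaborn's `heatmap` to produce the visual version of the same information.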

Next, we train and test our machine learning models using techniques such as SVM and XGBoost. We evaluate the performance of each model using metrics such as accuracy, precision, and recall. We also create a confusion matrix to visualize each model's performance.
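The training and evaluation step might look like the following for the SVM case. This is a sketch, not the repository's actual code: the 80/20 split, the `random_state`, the RBF kernel, and the use of scikit-learn's bundled dataset are all assumptions. An XGBoost model could be evaluated the same way by swapping in `xgboost.XGBClassifier`, assuming the `xgboost` package is installed.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% of the data, keeping the class ratio the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale features to [0, 1] and fit an SVM; the pipeline fits the scaler
# on the training split only.
clf = make_pipeline(MinMaxScaler(), SVC(kernel="rbf"))
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes
# (index 0 = malignant, index 1 = benign).
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Per-class precision, recall and F1, plus overall accuracy.
print(classification_report(y_test, y_pred,
                            target_names=["malignant", "benign"]))
```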

Finally, we deploy the best-performing model on localhost using Flask, a popular Python web framework. This gives us a web interface that accepts input from the user and returns a prediction of whether cancer is detected.
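A minimal sketch of such a Flask service is shown below. The `/predict` route, the `features` JSON key, and the choice to train the model at start-up are assumptions made for illustration; the repository's actual app may use an HTML form and a saved model file instead.

```python
from flask import Flask, jsonify, request
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Train a small model at start-up (a real deployment would more likely
# load a model that was trained and saved beforehand).
data = load_breast_cancer()
model = make_pipeline(MinMaxScaler(), SVC()).fit(data.data, data.target)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])  # hypothetical route name
def predict():
    payload = request.get_json(silent=True) or {}
    features = payload.get("features")  # hypothetical JSON key
    try:
        row = [float(v) for v in features]
    except (TypeError, ValueError):
        row = None
    # The model expects exactly 30 numeric values, in dataset column order.
    if row is None or len(row) != 30:
        return jsonify(error="expected 'features': a list of 30 numbers"), 400
    label = int(model.predict([row])[0])
    return jsonify(diagnosis=str(data.target_names[label]))


# To serve on localhost, call: app.run(host="127.0.0.1", port=5000)
```

A client would then send a POST request to `/predict` with a JSON body containing the 30 feature values and receive either `malignant` or `benign` as the diagnosis.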

Overall, this project will provide a practical introduction to machine learning and data visualization techniques while also addressing a critical real-world problem.

CREDITS:

Vibhu Sharma | Avid Learner | Data Scientist | Machine Learning Engineer | Deep Learning enthusiast

Contact me for Data Science Project Collaborations
