This project aims to predict the presence of heart disease in patients using machine learning techniques, particularly ensemble methods. The dataset used for training and testing contains various features such as age, sex, chest pain type, cholesterol levels, and other medical parameters. By analyzing these features, the goal is to build an accurate prediction model that can assist healthcare professionals in diagnosing heart disease.
Heart disease is one of the leading causes of death worldwide. Early detection and accurate diagnosis play a crucial role in preventing and managing heart-related conditions. Machine learning techniques offer promising solutions for predicting heart disease based on patient data.
The dataset used in this project is sourced from Kaggle and IEEE DataPort.
The project follows these main steps:
- Data Preprocessing: Cleaning and preparing the dataset for model training.
- Exploratory Data Analysis (EDA): Analyzing the distribution and relationships between different features.
- Model Selection: Experimenting with different ensemble methods such as Random Forest, Gradient Boosting, and Stacking.
- Model Evaluation: Assessing model performance using metrics such as accuracy, precision, recall, and F1-score.
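The steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: it uses a synthetic dataset from `make_classification` as a stand-in for the heart disease data (the 13-feature shape is an assumption), and the helper names `build_models` and `evaluate`, along with all hyperparameters, are hypothetical choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the heart disease data (assumption: 13 numeric
# features and a binary target; the real dataset is not loaded here).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=42
)

# Hold out a stratified test split so class balance is preserved.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)


def build_models():
    """Return the three ensemble candidates named in the model selection step."""
    return {
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=42),
        "gradient_boosting": GradientBoostingClassifier(
            n_estimators=50, random_state=42
        ),
        # Stacking combines the two base learners through a logistic
        # regression meta-learner (an illustrative choice).
        "stacking": StackingClassifier(
            estimators=[
                ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
                ("gb", GradientBoostingClassifier(n_estimators=50, random_state=42)),
            ],
            final_estimator=LogisticRegression(max_iter=1000),
            cv=3,
        ),
    }


def evaluate(model, X_train, y_train, X_test, y_test):
    """Fit the model and compute the four metrics on the held-out split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
    }


results = {
    name: evaluate(model, X_train, y_train, X_test, y_test)
    for name, model in build_models().items()
}
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

In a medical screening setting, recall is often worth watching alongside accuracy, since a false negative means a missed case of heart disease.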
The flowchart of the algorithm is shown below.

The project requires the following dependencies:
- Python 3.x
- NumPy
- Pandas
- Matplotlib
- Seaborn
- Scikit-learn
You can install the dependencies using the following command:
pip install -r requirements.txt
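After installation, a quick check like the one below can confirm that the listed packages are importable. This is only a sketch: the `REQUIRED` list and the `missing_packages` helper are hypothetical names introduced here, and the list uses import names, so Scikit-learn appears as `sklearn`.

```python
import importlib.util

# Import names for the dependencies listed above
# (assumption: Scikit-learn is imported as "sklearn").
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "sklearn"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All dependencies are available.")
```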