Name		Name	Last commit message	Last commit date
parent directory ..
.DS_Store		.DS_Store
HealthData.csv		HealthData.csv
KNN.py		KNN.py
KNN_model.pkl		KNN_model.pkl
Log_ROC.png		Log_ROC.png
Logistic_regression_model.pkl		Logistic_regression_model.pkl
README.md		README.md
cleve.csv		cleve.csv
decision_tree.py		decision_tree.py
decision_tree_model.pkl		decision_tree_model.pkl
deep_learning.py		deep_learning.py
deep_learning_model.pkl		deep_learning_model.pkl
logistic_regression.py		logistic_regression.py
naive_bayes.py		naive_bayes.py
naive_bayes_model.pkl		naive_bayes_model.pkl
newdata.csv		newdata.csv
standard_scalar.pkl		standard_scalar.pkl
svc.py		svc.py
svc_model.pkl		svc_model.pkl

README.md

Heart-Disease-Prediction

Cardiovascular diseases are the leading cause of death globally, resulted in 17.9 million deaths (32.1%) in 2015, up from 12.3 million (25.8%) in 1990. It is estimated that 90% of CVD is preventable. There are many risk factors for heart diseases that we will take a closer look at.

The main objective of this study is to build a model that can predict the heart disease occurrence, based on a combination of features (risk factors) describing the disease. Different machine learning classification techniques will be implemented and compared upon standard performance metric such as accuracy.

The dataset used for this study was taken from UCI machine learning repository

Description:

This data set currently contains 303 instances, some of which aren't complete (some features may be missing for a certain instance). In the case that this happens, the instance has been removed. There are 14 relevant features which have been extracted, from a maximum of 76 in the total dataset.

Features information:

1.age - age in years

2.sex - sex(1 = male; 0 = female)

3.chest_pain - chest pain type (1 = typical angina; 2 = atypical angina; 3 = non-anginal pain; 4 = asymptomatic)

4.blood_pressure - resting blood pressure (in mm Hg on admission to the hospital)

5.serum_cholestoral - serum cholestoral in mg/dl

6.fasting_blood_sugar - fasting blood sugar > 120 mg/dl (1 = true; 0 = false)

7.electrocardiographic - resting electrocardiographic results (0 = normal; 1 = having ST-T; 2 = hypertrophy)

8.max_heart_rate - maximum heart rate achieved

9.induced_angina - exercise induced angina (1 = yes; 0 = no)

10.ST_depression - ST depression induced by exercise relative to rest

11.slope - the slope of the peak exercise ST segment (1 = upsloping; 2 = flat; 3 = downsloping)

12.no_of_vessels - number of major vessels (0-3) colored by flourosopy

13.thal - 3 = normal; 6 = fixed defect; 7 = reversable defect

14.num:diagnosis - the predicted attribute - diagnosis of heart disease (angiographic disease status) (Value 0 = < 50% diameter narrowing; Value 1 = > 50% diameter narrowing)

Types of features:

A. Categorical features (Has two or more categories and each value in that feature can be categorised by them): sex, chest_pain

B. Ordinal features (Variable having relative ordering or sorting between the values): fasting_blood_sugar, electrocardiographic, induced_angina, slope, no_of_vessels, thal, diagnosis

C. Continuous features (Variable taking values between any two points or between the minimum or maximum values in the feature column): age, blood_pressure, serum_cholestoral, max_heart_rate, ST_depression

The main goal is to predict heart disease occurrence with the highest accuracy. In order to achieve this, several classification algorithms are tested.

1.LOGISTIC REGRESSION 2.NAIVE BAYES 3.DECISION TREE 4.SVC

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

machine_learning_models

machine_learning_models

README.md

Heart-Disease-Prediction

Files

machine_learning_models

Directory actions

More options

Directory actions

More options

Latest commit

History

machine_learning_models

Folders and files

parent directory

README.md

Heart-Disease-Prediction