
Predicting the Likelihood of Diabetes Using Common Signs and Symptoms

According to findings published by diabetes institutes around the world, about one-third of patients with diabetes do not know that they have the disease. Detecting and treating diabetes at an early stage is critical to keeping patients healthy and protecting their quality of life. Early detection also helps mitigate the risk of serious complications of diabetes such as heart disease, stroke, blindness, limb amputation, and kidney failure.

The data set consists of the signs and symptoms of 516 newly diabetic or would-be diabetic patients who presented at Sylhet Diabetes Hospital in Sylhet, Bangladesh. The data were collected at the hospital using direct questionnaires under the supervision of doctors. The source for the data set is the UCI Machine Learning Repository: <https://archive.ics.uci.edu/ml/datasets/Early+stage+diabetes+risk+prediction+dataset>. The data set has 16 descriptive features and one target feature.

This study intends to build a logistic regression model that predicts the likelihood of having diabetes from the common signs and symptoms presented by patients. A successful model would support early detection of diabetes from the signs and symptoms shown by possible patients. The study consists of two phases:

1) Phase I - preprocess and explore the data set to prepare it for model development.
2) Phase II - build a logistic regression model to predict the likelihood of having diabetes based on signs and symptoms.

Phase I was completed in a previous submission; this report covers the work carried out for Phase II. All work was performed in R, and the report was compiled using R Markdown.

Source for the Data-set:

https://archive.ics.uci.edu/ml/datasets/Early+stage+diabetes+risk+prediction+dataset

Descriptive features in the data-set:

Descriptive_features.csv

Phase I:

Pre-processing and exploring the data-set in order to build the model in Phase II

Data-set: Phase1_Data.csv

R Code: Phase1_Code.RMD

Report: Phase1_Report.pdf

Phase II:

Building a logistic regression model to predict the likelihood of having diabetes based on signs and symptoms

Data-set: Phase2_Data.csv

R Code: Phase2_Code.RMD

Report: Phase2_Report.pdf
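
For readers unfamiliar with the Phase II technique, the following is a minimal sketch of fitting a logistic regression in base R with `glm()`. It is not the code from Phase2_Code.RMD. The data are simulated, and the column names (`polyuria`, `polydipsia`, `age`, `diabetic`) and all coefficient values are hypothetical placeholders rather than the contents of Phase2_Data.csv.

```r
# Minimal sketch: logistic regression on simulated symptom data.
# ASSUMPTION: every column name and value below is hypothetical and is
# not taken from Phase2_Data.csv.
set.seed(42)

n <- 200
toy <- data.frame(
  polyuria   = rbinom(n, 1, 0.5),       # 1 = symptom present, 0 = absent
  polydipsia = rbinom(n, 1, 0.4),       # 1 = symptom present, 0 = absent
  age        = round(runif(n, 20, 80))  # age in years
)

# Simulate a binary outcome whose log-odds increase with both symptoms.
log_odds <- -2 + 2.0 * toy$polyuria + 1.5 * toy$polydipsia + 0.01 * toy$age
toy$diabetic <- rbinom(n, 1, plogis(log_odds))

# Fit the logistic regression (binomial family, logit link).
fit <- glm(diabetic ~ polyuria + polydipsia + age,
           data = toy, family = binomial(link = "logit"))

# Predicted probability of diabetes for each simulated patient.
toy$p_hat <- predict(fit, newdata = toy, type = "response")

# Exponentiated coefficients are odds ratios for each predictor.
odds_ratios <- exp(coef(fit))

print(summary(fit)$coefficients)
print(round(odds_ratios, 2))
```

With the real data, the simulated data frame would be replaced by the Phase II data set and the formula by the features selected in the report; `type = "response"` is what returns a probability (the "likelihood of having diabetes") rather than log-odds.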