This project splits a dataset into multiple tables and imports them into an SQL database, with primary and foreign keys set up to enforce relational integrity. The dataset was sourced from Kaggle and is structured to facilitate efficient data analysis.
- `HealthMetrics` columns:
  - `PatientID` (VARCHAR, Primary Key, Foreign Key)
  - `Cholesterol` (INT)
  - `BloodPressure` (VARCHAR)
  - `HeartRate` (INT)
  - `Triglycerides` (INT)
  - `BMI` (DOUBLE)
- `Lifestyle` columns:
  - `PatientID` (VARCHAR, Primary Key, Foreign Key)
  - `Smoking` (BOOLEAN)
  - `Obesity` (BOOLEAN)
  - `AlcoholConsumption` (BOOLEAN)
  - `Diet` (VARCHAR)
  - `PhysicalActivityDaysPerWeek` (INT)
  - `SleepHoursPerDay` (DOUBLE)
  - `SedentaryHoursPerDay` (DOUBLE)
  - `ExerciseHoursPerWeek` (DOUBLE)
- `MedicalHistory` columns:
  - `PatientID` (VARCHAR, Primary Key, Foreign Key)
  - `Diabetes` (BOOLEAN)
  - `PreviousHeartProblems` (BOOLEAN)
  - `MedicationUse` (BOOLEAN)
  - `StressLevel` (INT)
- `Patients` columns:
  - `PatientID` (VARCHAR, Primary Key)
  - `Age` (INT)
  - `Sex` (VARCHAR)
  - `FamilyHistory` (BOOLEAN)
  - `Country` (VARCHAR)
  - `Continent` (VARCHAR)
  - `Hemisphere` (VARCHAR)
- `RiskAssessment` columns:
  - `PatientID` (VARCHAR, Primary Key, Foreign Key)
  - `HeartAttackRisk` (BOOLEAN)
- `SocioeconomicStatus` columns:
  - `PatientID` (VARCHAR, Primary Key, Foreign Key)
  - `Income` (INT)
- Primary Keys:
  - Each table has `PatientID` as its primary key, ensuring that each record is uniquely identifiable.
- Foreign Keys:
  - The `PatientID` in the `HealthMetrics`, `Lifestyle`, `MedicalHistory`, `RiskAssessment`, and `SocioeconomicStatus` tables serves as a foreign key that references the `PatientID` in the `Patients` table. This enforces referential integrity across the database.
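The key setup described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the project does not say which database system it targets, so SQLite stands in here, only three of the six tables are shown, and the helper names `build_database` and `orphan_is_rejected` as well as the ID `P001` are hypothetical. SQLite treats the declared types as type affinities, so other systems may handle `BOOLEAN` or `DOUBLE` differently.

```python
import sqlite3

# Column names and types follow the table listing above (three tables shown).
SCHEMA = """
CREATE TABLE Patients (
    PatientID     VARCHAR PRIMARY KEY,
    Age           INT,
    Sex           VARCHAR,
    FamilyHistory BOOLEAN,
    Country       VARCHAR,
    Continent     VARCHAR,
    Hemisphere    VARCHAR
);
CREATE TABLE HealthMetrics (
    -- PatientID is both the primary key and a foreign key to Patients.
    PatientID     VARCHAR PRIMARY KEY REFERENCES Patients (PatientID),
    Cholesterol   INT,
    BloodPressure VARCHAR,
    HeartRate     INT,
    Triglycerides INT,
    BMI           DOUBLE
);
CREATE TABLE RiskAssessment (
    PatientID       VARCHAR PRIMARY KEY REFERENCES Patients (PatientID),
    HeartAttackRisk BOOLEAN
);
"""


def build_database() -> sqlite3.Connection:
    """Create an in-memory database with foreign key enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def orphan_is_rejected(conn: sqlite3.Connection) -> bool:
    """Return True if a HealthMetrics row without a matching patient is refused."""
    try:
        conn.execute(
            "INSERT INTO HealthMetrics (PatientID, Cholesterol) VALUES ('NO_SUCH_ID', 200)"
        )
    except sqlite3.IntegrityError:
        return True
    return False


if __name__ == "__main__":
    conn = build_database()
    # 'P001' is a made-up identifier used only for this demonstration.
    conn.execute("INSERT INTO Patients (PatientID, Age) VALUES ('P001', 54)")
    conn.execute("INSERT INTO HealthMetrics (PatientID, Cholesterol) VALUES ('P001', 210)")
    print("orphan row rejected:", orphan_is_rejected(conn))
```

Because each child table uses `PatientID` as both primary and foreign key, every child table holds at most one row per patient, and that row can only exist if the patient is present in `Patients`.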
The dataset used in this project was sourced from Kaggle. It was split into several tables to normalize the data structure, improve query performance, and make the data easier to manage.
- The data types were adjusted to match the intended use case (e.g., converting certain integer columns to BOOLEAN where applicable).
- Primary keys and foreign keys were established to maintain data integrity and relationships between tables.
- Import the SQL files into your database management system.
- The database is ready for use with any SQL-based queries for analysis, reporting, or application development.
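As a rough end-to-end illustration of the preparation and usage steps above, the sketch below splits a flat file into per-table rows, converts 0/1 integers to booleans, loads the rows, and runs a join. It is not the project's actual import script: the two CSV rows are made-up values, only a subset of columns is used, SQLite stands in for whichever database system you import into, and the names `FLAT_CSV`, `TABLE_COLUMNS`, `convert`, `split_rows`, `load`, and `QUERY` are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical two-row extract of the flat dataset (made-up values, subset of columns).
FLAT_CSV = """PatientID,Age,Cholesterol,Smoking,HeartAttackRisk
P001,54,210,1,0
P002,61,260,0,1
"""

# Which flat columns go to which table. Patients comes first so that parent
# rows are inserted before the child rows that reference them.
TABLE_COLUMNS = {
    "Patients": ["PatientID", "Age"],
    "HealthMetrics": ["PatientID", "Cholesterol"],
    "Lifestyle": ["PatientID", "Smoking"],
    "RiskAssessment": ["PatientID", "HeartAttackRisk"],
}
BOOLEAN_COLUMNS = {"Smoking", "HeartAttackRisk"}  # assumed to be 0/1 in the flat file
INT_COLUMNS = {"Age", "Cholesterol"}


def convert(column, raw):
    """Convert a raw CSV string to the type intended for its column."""
    if column in BOOLEAN_COLUMNS:
        return raw == "1"
    if column in INT_COLUMNS:
        return int(raw)
    return raw


def split_rows(flat_csv):
    """Split each flat row into one tuple per target table."""
    tables = {name: [] for name in TABLE_COLUMNS}
    for row in csv.DictReader(io.StringIO(flat_csv)):
        for name, columns in TABLE_COLUMNS.items():
            tables[name].append(tuple(convert(c, row[c]) for c in columns))
    return tables


def load(tables):
    """Create the (reduced) schema in memory and insert the split rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Patients (PatientID VARCHAR PRIMARY KEY, Age INT);
        CREATE TABLE HealthMetrics (
            PatientID VARCHAR PRIMARY KEY REFERENCES Patients (PatientID),
            Cholesterol INT);
        CREATE TABLE Lifestyle (
            PatientID VARCHAR PRIMARY KEY REFERENCES Patients (PatientID),
            Smoking BOOLEAN);
        CREATE TABLE RiskAssessment (
            PatientID VARCHAR PRIMARY KEY REFERENCES Patients (PatientID),
            HeartAttackRisk BOOLEAN);
    """)
    for name, rows in tables.items():
        placeholders = ", ".join("?" for _ in TABLE_COLUMNS[name])
        conn.executemany(f"INSERT INTO {name} VALUES ({placeholders})", rows)
    return conn


# Example analysis query: patients flagged as at risk, with their cholesterol.
QUERY = """
SELECT p.PatientID, p.Age, h.Cholesterol, r.HeartAttackRisk
FROM Patients AS p
JOIN HealthMetrics AS h ON h.PatientID = p.PatientID
JOIN RiskAssessment AS r ON r.PatientID = p.PatientID
WHERE r.HeartAttackRisk = 1
ORDER BY p.PatientID
"""

if __name__ == "__main__":
    conn = load(split_rows(FLAT_CSV))
    for row in conn.execute(QUERY):
        print(row)
```

Since every table shares `PatientID`, analysis queries typically start from `Patients` and join only the tables they need.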