This notebook uses a dataset from the UCI Machine Learning Repository and several classification models to predict whether a patient has Parkinson's disease, with a focus on improving the Logistic Regression model.
- Load and Observe Data
- Data Preprocessing
- Min-Max Normalization
- Problem Definition
Each modeling section consists of a brief introduction, the algorithm, a discussion of related topics, and an application to the dataset.
- kNN
- Non-parametric Models
- Algorithm
- Naive Bayes
- Bayes Classifier
- Algorithm
- Generative Model vs. Discriminative Model
- Logistic Regression
- Sigmoid Function
- Maximum Likelihood Estimation
- Algorithm
- (See also Appendices A and B for related topics)
- Support Vector Machine
- Convex Sets and Convex Hulls
- Algorithm
- Soft-Margin SVM
- Kernel SVM
- Kernel
- Mercer's Theorem
- RBF
- Algorithm
- Decision Tree
- PCA
- Pipeline
- ROC Curve
- Change Threshold
- Classification Report
Appendix A: Concepts for Logistic Regression
- A1. Binary Classification
- A2. Log Odds
- A3. Linear Discriminant Analysis
Appendix B: Linear Classifiers
- B1. Definition
- B2. Linear Separability
- B3. Methods