This was mostly a 'practice' repository, containing some ML algorithms that I implemented from scratch.
I no longer update or maintain this.
- K Nearest Neighbours: Recommends movies from the TMDB 5000 movies dataset based on the list of genres given as input.
- Logistic Regression: Predicts how likely people are to buy a product based on their gender, age, and salary.
- Simple Neural Network: 2-layer neural network that mimics the XOR gate, implemented (vectorized) from scratch using NumPy.
- Digit Classification: Dataset used: MNIST
  - Contains a binary classifier that labels every 0 as 1 and every other digit as 0.
  - Also contains an extension of the above classifier that classifies all 10 digits with an accuracy of 94%.
  - Both networks are 2-layer and are implemented (vectorized) from scratch using NumPy.
- Decision Trees: Decision tree classifier implemented from scratch in Python. Dataset used: Banknote authentication dataset
- Support Vector Machine: A simple C-SVM binary classifier. Dataset used: Breast Cancer Wisconsin Dataset
- K-Means Clustering:
  - Dataset used: Synthetic 2-d data with N=5000 vectors and k=15 Gaussian clusters with varying degrees of cluster overlap
  - Implemented the K-Means clustering algorithm. Used matplotlib to visualize clusters and centroids.
- Principal Component Analysis:
  - Dataset used: AT&T Database of Faces
  - Applied the Principal Component Analysis (PCA) algorithm for dimensionality reduction on face images.
- Moving Averages:
  - Dataset used: Air Quality Data Set
  - Applied Simple Moving Average (SMA), Cumulative Moving Average (CMA), Weighted Moving Average (WMA), and Exponentially Weighted Moving Average (EWMA) to the dataset. All functions are written in NumPy.
- Histogram Equalization:
  - Covers the theory behind histogram equalization
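
For the K-Means entry in the list above, the following is a minimal sketch of Lloyd's algorithm in NumPy. It is not the repository's code: the function name `kmeans`, its parameters, and the random-sample initialization are assumptions made for illustration.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Cluster the rows of X into k groups (hypothetical helper, Lloyd's algorithm)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: initialize centroids from k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]

    for _ in range(n_iters):
        # Squared Euclidean distance from every point to every centroid, shape (n, k).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)

        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)

        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    # Final assignment against the centroids that are returned.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d2.argmin(axis=1)
```

Calling `centroids, labels = kmeans(X, k=15)` on an `(N, 2)` array returns a `(15, 2)` array of centroids and one label per point, which can then be passed to a matplotlib scatter plot.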
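
For the Principal Component Analysis entry in the list above, here is a minimal sketch of PCA-based dimensionality reduction. The repository's actual approach is not described, so computing the components through an SVD of the centered data is an assumption, as are the helper names `pca` and `reconstruct`.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the top principal components (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean  # PCA operates on mean-centered data

    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]

    # Variance captured along each kept direction.
    explained_variance = S[:n_components] ** 2 / (len(X) - 1)

    projected = Xc @ components.T  # shape (n_samples, n_components)
    return projected, components, mean, explained_variance

def reconstruct(projected, components, mean):
    """Map low-dimensional codes back to the original feature space."""
    return projected @ components + mean
```

For face images, each image would be flattened into one row of `X`; keeping fewer components than pixels gives the reduced representation, and `reconstruct` returns an approximation of the original images.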
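
For the Moving Averages entry in the list above, the four averages can be sketched in NumPy as follows. The function names and signatures are assumptions for illustration, not the repository's own, and the WMA here assumes linearly increasing weights, which is one common convention.

```python
import numpy as np

def sma(x, window):
    """Simple moving average: unweighted mean of each length-`window` slice."""
    x = np.asarray(x, dtype=float)
    return np.convolve(x, np.ones(window) / window, mode="valid")

def cma(x):
    """Cumulative moving average: mean of all samples seen so far."""
    x = np.asarray(x, dtype=float)
    return np.cumsum(x) / np.arange(1, x.size + 1)

def wma(x, window):
    """Weighted moving average with linear weights 1..window (newest sample heaviest)."""
    x = np.asarray(x, dtype=float)
    weights = np.arange(1, window + 1, dtype=float)
    weights /= weights.sum()
    # np.convolve flips its second argument, so pass the weights reversed.
    return np.convolve(x, weights[::-1], mode="valid")

def ewma(x, alpha):
    """Exponentially weighted moving average: y[t] = alpha*x[t] + (1-alpha)*y[t-1]."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    out[0] = x[0]  # assumption: seed the recursion with the first sample
    for t in range(1, x.size):
        out[t] = alpha * x[t] + (1 - alpha) * out[t - 1]
    return out
```

`sma` and `wma` return `len(x) - window + 1` values because only full windows are averaged, while `cma` and `ewma` return one value per input sample.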
- Harshit Varma (hrshtv)