GitHub repository created to submit Homework 4 for Algorithmic Methods of Data Mining.
Work carried out by Group 4, consisting of the following members:
- Erica Capocello : 1862113
- Luca Mazzucco : 1997610
- The main file is the HW_5.ipynb notebook, which contains the code for:
  - Data Setup
  - Hashing
  - LSH: Locality-Sensitive Hashing
  - PCA: Principal Component Analysis
  - Clustering: K-Means, K-Means++, Hierarchical Clustering
  - Bonus Question
- The module.py file contains the functions needed for the hashing tasks (preprocessing query_users.csv).
- CommandLine_HW4.sh contains the answer to the Command Line Question as a shell script.
- Algorithmic_Question.ipynb contains the answer to the Algorithmic Question.