JaniceLC/BigDataAnalyticsTutorials

BigDataAnalyticsTutorials

Tutorials for STAT4609 (2023) Big Data Analytics

Example Class 1 Intro

  1. Introduction to Python
  2. Object-Oriented Programming (OOP) in Python 3
  3. PCA

Example Class 2

  1. Never use 'print()' to debug again! Pysnooper is all you need!
  2. Lasso Regression using scikit-learn :)
  3. Application of tree-based methods with scikit-learn, including decision trees, random forests, and boosting.
  4. Implementation of Decision Tree from scratch

Example Class 3 Kernel Based Method

  1. Kernel regression review
  2. Kernel regression implementation from scratch [Simplified example for HW3]

Example Class 4

  1. A1 & A2 extension: Implementation of Lasso Logistic Regression for Binary Classification from scratch

  2. Generalized Additive Models (GAM)

  3. Implementation of Naive Bayes from scratch

  4. Implementation of K-Means Clustering from scratch
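
For orientation, a from-scratch K-Means can be sketched roughly as below. This is a minimal illustration of Lloyd's algorithm in pure Python, not the code from the class notebook; the names `kmeans` and `sq_dist`, the seeded random initialization, and the toy data are all assumptions made for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Cluster `points` (a list of tuples) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    # Assumed initialization: pick k distinct data points as starting centroids.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # Centroids stopped moving, so assignments are stable.
        centroids = new_centroids
    return centroids, labels


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
```

On this toy data the first three points should end up sharing one label and the last three the other. Which group gets label 0 depends on the random initialization.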

Example Class 5 Latent Variable Models

  1. Non-negative Matrix Factorization using Alternating Least Squares [Important for HW5]
  2. User-based Collaborative Filtering from scratch, which is easy to extend for the item-based case.
  3. User-based Collaborative Filtering using Surprise [Surprise is an easy-to-use Python scikit for recommender systems.]
  4. PyTorch Introduction I

Example Class 6 Networks

  1. Graph Basics with NetworkX
  2. Spectral Clustering Implementation from Scratch
  3. Comparison with K-means Clustering using Scikit-Learn
  4. Image Segmentation using Spectral Clustering
  5. Hierarchical Clustering with Scikit-Learn on both graph and tabular data

Example Class 7 NLP

  1. TF-IDF:
  • Implementation from scratch
  • Using scikit-learn
  2. Text Classification with TF-IDF
  • using Naive Bayes Classifier
  • using Logistic Regression
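
As a rough idea of what a from-scratch TF-IDF involves, here is a minimal pure-Python sketch. It assumes the plain textbook weighting, tf = count / document length and idf = ln(N / df); the function name `tf_idf` and the toy corpus are made up for this example and are not taken from the class notebook. scikit-learn's `TfidfVectorizer` uses a smoothed idf and L2 normalization by default, so its numbers will differ from these.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus, already tokenized by whitespace.
docs = [
    "the cat sat".split(),
    "the dog sat".split(),
    "the dog barked".split(),
]
scores = tf_idf(docs)
```

A term that appears in every document, such as "the" here, gets weight zero, while a term unique to one document, such as "cat", gets the largest weight in that document.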

Example Class 8 SVM + Deep Learning

  1. SVM implementation from scratch
  2. A gentle introduction to torch.autograd.
  3. Perceptron and neural networks.

Example Class 9 Deep Learning

Example Class 10 Latent Dirichlet Allocation

About

Tutorials for STAT4609 (2022) Big Data Analytics
