
Data Science Notes

This is a repository where I dump notes about things I have learned while working as a data scientist: machine learning and statistical methods and how they work.

Why

It's easy to forget what you've already learned if you're not keeping track, especially if you don't use these things every day. The many libraries and statistical packages available make it easy to apply the most well-established methods in statistics and machine learning, essentially turning them into black boxes. Even after acquainting myself with the process and theory behind a method, I find myself having to go back and relearn it a few months later.

Sources

I don't list sources in the notebooks because none of these ideas are my own anyway; I just rephrase them in a way that makes sense to me. Some of the examples are original and I've worked them out myself. Here are a few references I have used for some of the notes:

  • Hastie, Tibshirani, and Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2009.
  • MIT OpenCourseWare 6.034 Artificial Intelligence with Dr. Patrick Winston.
  • Andrew Ng's video lectures.
  • Josh Starmer's StatQuest videos on YouTube.
  • The original papers, which I sometimes skim to understand the general procedure of an algorithm (e.g. SAMME, ExtraTrees).
  • The scikit-learn source code.
  • Wikipedia.
