This repository contains .ipynb files for each concept covered in DSCI 502 - Data Mining at Scale.
These are transcribed directly from the lecture videos. I am not the author of these lectures. I claim no ownership over the presentation or material. I merely transcribed them for easier reference during the course.
Textbooks referenced include:
- Mining of Massive Datasets, by Leskovec et al.; Third Edition; Cambridge University Press 2020.
- Spark: The Definitive Guide, by Zaharia and Chambers; O’Reilly 2018.
Instructor: C Wedrychowicz, Saint Mary's College
Contact the instructor at wedrych@saintmarys.edu.