The most challenging topics for students have been (1) parsing text with RDDs and (2) understanding parallel computing concepts.
These practices will help:
- recognize that finding these topics hard is expected; the other topics may come more easily
- devote adequate time to reviewing notes, practicing, and completing the labs
- skim the labs early so you can begin thinking about them
- work with classmates, the TA, and the instructor
The course has three areas of focus:
- Using Spark for batch processing, stream processing, ML, and analytics
- Big data and cloud computing: learning the concepts and using the tools
- Working on a team project with Spark MLlib
Students enjoy the group project.
Reminder: each student writes their own code for the lab assignments.