Flight Analytics with sparklyr
Created by Michiaki Ariga (aki@cloudera.com)
Steps:
- Open a CDSW terminal and run setup.sh.
- Create an R session and run setup.R.
- Run flight-analytics.R in the same R Session.
- Optionally, run flight-regression.R in an R session. It takes considerably longer (about 15 minutes), so you may want to run it in a separate session.
- When finished, run cleanup.R in your R session and cleanup.sh in the terminal.
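
The R portion of the steps above can be sketched as a small helper that sources the scripts in order within one session. This is only an illustration: `run_in_order` is a hypothetical name, not part of this project, and it assumes the scripts sit in the session's working directory and that setup.sh has already been run in a terminal. Whether the scripts rely on objects created by setup.R is also an assumption; the helper uses `source()` with its default global evaluation so that any such objects stay visible to later scripts.

```r
# Hypothetical helper (not part of this project): source scripts in order
# inside the current R session, stopping early if any file is missing.
run_in_order <- function(scripts, dir = ".") {
  paths <- file.path(dir, scripts)

  # Fail before running anything if a script cannot be found.
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("Missing script(s): ", paste(missing, collapse = ", "))
  }

  for (p in paths) {
    message("Running ", p)
    # Default local = FALSE evaluates in the global environment, so objects
    # created by one script remain available to the next.
    source(p)
  }
  invisible(paths)
}

# Usage, following the steps above (commented out so that defining the
# helper has no side effects):
# run_in_order(c("setup.R", "flight-analytics.R"))
# run_in_order("flight-regression.R")  # optional, approx 15 min
# run_in_order("cleanup.R")
```

Running cleanup.sh in the terminal afterwards remains a separate, manual step.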
Recommended Session Sizes: 4 CPU, 8 GB RAM
Recommended Jobs/Pipeline:
None
Notes:
Estimated Runtime:
flight-analytics.R --> approx 1 min
flight-regression.R --> approx 15 min
Demo Script
TBD
Related Content:
http://blog.cloudera.com/blog/2017/02/analyzing-us-flight-data-on-amazon-s3-with-sparklyr-and-apache-spark-2-0/