mw866/titanic

Titanic

The classic Titanic data science project using Spark and AWS EMR.

Please refer to ./titanic.ipynb for the code and the wiki for the report.

Instructions: Set Up Local Development Environment

Instructions: AWS EMR

  1. Set up SSH Tunnel
  2. Configure SOCKS Proxy in Browser (using Foxy Proxy)
  3. Test the proxy: http://master-public-dns-name/
  4. Access the Web Interfaces:
    • Zeppelin (Notebook): 8890
    • Spark (Cluster log): 18080
    • Ganglia (Monitoring): */ganglia/
    • Hadoop (MapReduce): 8088/cluster
  • Install Git on AWS EMR: sudo yum install git-all
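
The tunnel and web interface steps above can be sketched as a small shell helper. This is a minimal sketch rather than the project's own script: the key file path, the local SOCKS port 8157, and the function names `tunnel_command` and `web_urls` are assumptions for illustration, and `master-public-dns-name` is the same placeholder used above. It assumes the default `hadoop` login user on the EMR master node. The script only prints the command and URLs, so it is safe to run before the cluster exists.

```shell
#!/usr/bin/env bash
# Assumed values: replace with your own master node DNS name and key file.
MASTER_DNS="${MASTER_DNS:-master-public-dns-name}"
KEY_FILE="${KEY_FILE:-$HOME/mykeypair.pem}"   # hypothetical key path
SOCKS_PORT="${SOCKS_PORT:-8157}"              # assumed local port; any free port works

# Print the ssh command that opens the tunnel (step 1).
# -N: run no remote command; -D: dynamic (SOCKS) port forwarding on a local port.
tunnel_command() {
  printf 'ssh -i %s -N -D %s hadoop@%s\n' "$KEY_FILE" "$SOCKS_PORT" "$MASTER_DNS"
}

# Print the web interface URLs from step 4, reachable through the proxy.
web_urls() {
  printf 'Zeppelin: http://%s:8890/\n' "$MASTER_DNS"
  printf 'Spark:    http://%s:18080/\n' "$MASTER_DNS"
  printf 'Ganglia:  http://%s/ganglia/\n' "$MASTER_DNS"
  printf 'Hadoop:   http://%s:8088/cluster\n' "$MASTER_DNS"
}

tunnel_command
web_urls
```

Run the printed `ssh` command in a separate terminal and leave it open, then point the browser's SOCKS proxy (for example in Foxy Proxy) at `localhost` on the same port before opening the URLs.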

Reference

Troubleshooting
