Created by Tom White (tom@cloudera.com)
Hail is an open-source, scalable framework for exploring and analyzing genetic data. This repo contains the Hail Tutorial, lightly reformatted to run in Cloudera Data Science Workbench.
Status: In Progress
Use Case: Genetics
Steps:
- Go to Project > Settings > Environment and set Spark Configuration to hail-genetics-tutorial/spark-defaults.conf
- Open a CDSW terminal and run setup.sh
- Create a Python Session and run tutorial.py
- When finished, run cleanup.sh in the terminal
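Before starting, it may help to confirm that the files named in the steps are present. The following is a minimal sketch, not part of the tutorial itself: the helper `missing_files` is hypothetical, and it assumes all four files sit directly in the directory you pass in, which may not match your checkout.

```python
from pathlib import Path
from typing import Iterable, List

# File names are taken from the steps above; that they all live in one
# directory is an assumption made for this sketch.
EXPECTED_FILES = ("spark-defaults.conf", "setup.sh", "tutorial.py", "cleanup.sh")


def missing_files(root: str, names: Iterable[str] = EXPECTED_FILES) -> List[str]:
    """Return the expected file names that are not regular files under `root`."""
    base = Path(root)
    return [name for name in names if not (base / name).is_file()]
```

For example, `missing_files("hail-genetics-tutorial")` returns an empty list when everything is in place, and otherwise lists the names to look for.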
Recommended Session Sizes:
Estimated Runtime:
Notes:
- Hail requires Java 8. If multiple versions of Java are installed on your system, set JAVA_HOME, PATH, etc. in the Project Settings environment variables so that Java 8 is the version picked up.
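To check which Java version a session actually picks up, something like the sketch below could be run in a Python session. The function names `parse_java_major` and `current_java_major` are illustrative, not part of Hail or CDSW, and the parsing assumes the usual `java -version` output format (for example `java version "1.8.0_232"` or `openjdk version "11.0.2"`).

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output.

    Handles the legacy scheme ("1.8.0_232" is Java 8) and the newer
    scheme ("11.0.2" is Java 11). Returns None if nothing matches.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Legacy version strings start with "1." and carry the major version second.
    if first == 1 and second is not None:
        return int(second)
    return first


def current_java_major() -> Optional[int]:
    """Return the major version of the `java` found on PATH, or None."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` normally prints to stderr, so both streams are inspected.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)
```

If `current_java_major()` returns anything other than 8, adjust JAVA_HOME and PATH in the project settings and start a new session.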
Recommended Jobs/Pipeline: None
Demo Script: TBD
Related Content: Video (Internal Only!): https://cloudera.webex.com/cloudera/ldr.php?RCID=af7861670238dc884a134c59ce55049e