A minimal working example of using Snakemake on an HPC running PBS-torque or SLURM.
You need Python 3 and Snakemake installed on a high-performance computing cluster running either the PBS-torque or SLURM job scheduler.
Rather than using the modules provided on our University's HPC, I prefer to use conda to manage my software dependencies.
Download the latest installer for Anaconda (includes everything) or Miniconda (recommended because it includes only the minimum and is faster to install).
e.g. for Linux:

```bash
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
```

and run the installer:

```bash
bash Miniconda3-latest-Linux-x86_64.sh
```
I like to create a separate conda environment for each of my projects. Example for this tutorial:
Create a new conda environment called `snakemake` and install Python & Snakemake from the bioconda channel:

```bash
conda create -n snakemake -c bioconda python=3 snakemake
```
Alternatively, you can re-create an environment from a YAML file like so:

```bash
conda env create -f path/to/environment.yml
```
Example environment files:

- UMich Flux: `config/env.smk-flux.yml`
- UMich Great Lakes: `config/env.smk-gl.yml`
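For orientation, an environment YAML for this setup might look something like the following (a hypothetical sketch; the repo's actual `config/env.smk-*.yml` files may pin different channels and versions):

```yaml
# Hypothetical minimal environment file for a Snakemake project.
name: snakemake
channels:
  - bioconda
  - conda-forge
dependencies:
  - python=3
  - snakemake
```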
Before submitting jobs for your project, activate the environment:

```bash
conda activate snakemake
```

The packages installed in `snakemake` are then available for any jobs you submit while the environment is activated. See the conda user guide for more details.
- Edit `config/cluster.json` with:
  - Your email
  - The default job parameters, e.g. walltime, nodes, memory, etc. These can be overridden by individual rules in the Snakemake file; see the sketch below.
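  As a rough sketch of what `config/cluster.json` might contain (the values are placeholders, and `some_heavy_rule` stands in for one of your own rule names), Snakemake's cluster config convention pairs a `__default__` entry with rule-named entries that override it:

  ```json
  {
      "__default__": {
          "email": "you@example.edu",
          "time": "01:00:00",
          "nodes": 1,
          "mem": "4GB"
      },
      "some_heavy_rule": {
          "time": "24:00:00",
          "mem": "32GB"
      }
  }
  ```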
- Edit `code/submit-pbs.sh` and/or `code/submit-slurm.sh` with:
  - Your email
  - HPC account
  - Queue/partition
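  For reference, the header of a SLURM submit script like `code/submit-slurm.sh` might look roughly like this (the email, account, and partition values are placeholders, not the repo's actual contents):

  ```bash
  #!/bin/bash
  #SBATCH --job-name=snakemake_main    # the controller job that launches the others
  #SBATCH --mail-user=you@example.edu  # placeholder: your email
  #SBATCH --mail-type=BEGIN,END,FAIL
  #SBATCH --account=your_account       # placeholder: your HPC account
  #SBATCH --partition=standard         # placeholder: your queue/partition
  #SBATCH --time=48:00:00              # long enough to outlast the whole workflow
  ```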
- Do a dry run to make sure your workflow doesn't have any syntax errors:

  ```bash
  snakemake -s code/myworkflow.smk --dryrun
  ```
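  If you're curious what `code/myworkflow.smk` contains, a minimal workflow along these lines would pass a dry run (purely illustrative; the repo's actual rules differ):

  ```python
  # Hypothetical minimal Snakemake workflow, for illustration only.
  rule all:
      input:
          "results/hello.txt"

  rule hello:
      output:
          "results/hello.txt"
      shell:
          "echo 'Hello from the cluster' > {output}"
  ```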
- Submit the workflow. With PBS-torque:

  ```bash
  qsub code/submit-pbs.sh
  ```

  Or with SLURM:

  ```bash
  sbatch code/submit-slurm.sh
  ```

  From this job, snakemake submits all of your other jobs using the parameters in `config/cluster.json`.
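  Under the hood, the submit script typically invokes snakemake with a cluster command template along these lines (a sketch, not the repo's exact flags; the `{cluster.*}` placeholders must match the keys in `config/cluster.json`):

  ```bash
  snakemake -s code/myworkflow.smk \
      --cluster-config config/cluster.json \
      --cluster "sbatch --time={cluster.time} --mem={cluster.mem} --nodes={cluster.nodes}" \
      --jobs 50 --latency-wait 60
  ```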