
# Usage information

- Basic execution
- Pipeline version
- Resources

## Basic execution

Please see our installation guide to learn how to set up this pipeline first.

A basic execution of the pipeline looks as follows:

**a) Without a site-specific config file**

```bash
nextflow run marchoeppner/gabi -profile singularity --input samples.csv \
  --reference_base /path/to/references \
  --run_name pipeline-test
```

where `/path/to/references` corresponds to the location in which you have installed the pipeline references. Omitting `--reference_base` triggers an on-the-fly temporary installation, which is not recommended in production.

In this example, the pipeline assumes it runs on a single computer with the Singularity container engine available. The available options to provision software are:

- `-profile singularity`
- `-profile docker`
- `-profile podman`
- `-profile conda`

Additional software provisioning tools as described here may also work but have not been tested by us. Please note that Conda may not work for all packages on all platforms; if that turns out to be the case for you, please consider switching to one of the supported container engines.
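For example, to run the same basic command with Docker instead of Singularity, only the profile name changes (all other options stay as in the example above):

```bash
nextflow run marchoeppner/gabi -profile docker --input samples.csv \
  --reference_base /path/to/references \
  --run_name pipeline-test
```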

**b) With a site-specific config file**

```bash
nextflow run marchoeppner/gabi -profile lsh --input samples.csv \
  --run_name pipeline-test
```

In this example, both `--reference_base` and the choice of software provisioning are already set in the local configuration `lsh` and do not have to be provided as command-line arguments.
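A site-specific profile of this kind could be sketched as a custom Nextflow config file. The profile name `lsh` comes from the example above; the paths, executor, and container engine shown here are placeholder assumptions, not the actual site configuration:

```groovy
// Hypothetical site profile "lsh" -- all values are assumptions for illustration
params {
    reference_base = "/data/references"   // assumed site-wide reference location
}

singularity {
    enabled = true                        // assumed container engine for this site
}

process {
    executor = "slurm"                    // assumed cluster scheduler
}
```

With such a profile in place, users at that site only need to supply run-specific options such as `--input` and `--run_name`.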

## Specifying pipeline version

If you are running this pipeline in a production setting, you will want to lock the pipeline to a specific version. This is natively supported through Nextflow with the `-r` argument:

```bash
nextflow run marchoeppner/pipeline -profile lsh -r 1.0 <other options here>
```

The `-r` option specifies a GitHub release tag or branch, so it could also point to `main` for the very latest code. Please note that every major release of this pipeline (1.0, 2.0 etc.) comes with a new reference data set, which has to be installed separately.

## Resources

The following options can be set to control resource usage outside of a site-specific config file.

`--max_cpus` [ default = 16 ]

The maximum number of CPUs a single job can request. This is typically the maximum number of cores available on a compute node or your local (development) machine.

`--max_memory` [ default = 128.GB ]

The maximum amount of memory a single job can request. This is typically the maximum amount of RAM available on a compute node or your local (development) machine. It is advisable to set this a little lower than the total amount of RAM to prevent the machine from swapping.

`--max_time` [ default = 240.h ]

The maximum run/wall time a single job can request. This is mostly relevant for environments where run time is restricted, such as a computing cluster with an active resource manager or possibly some cloud environments.
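Putting these options together, a run with tightened resource limits might look as follows; the specific values here are illustrative and should be adapted to your hardware:

```bash
nextflow run marchoeppner/gabi -profile singularity --input samples.csv \
  --run_name pipeline-test \
  --max_cpus 8 \
  --max_memory 32.GB \
  --max_time 48.h
```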