Home
The recommended installation method is to use a pre-built Docker or Singularity container.
Both the Docker and Singularity containers have the main script viridian_workflow installed.
Get a Docker image of the latest release:
docker pull ghcr.io/iqbal-lab-org/viridian_workflow:latest
All Docker images are listed in the packages page.
To build a docker container, clone this repository and then from its root run:
docker build --network=host .
(Without --network=host, pip install is likely to time out and the build will fail.)
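As a rough sketch of how a pulled or locally built image might be used, the helper below wraps docker run so that a host directory holding the reads and reference is visible inside the container. The function name viridian_docker, the IMAGE and DRY_RUN variables, and the /data mount point are assumptions made for this example. It also assumes the image does not define an entrypoint that interferes with naming viridian_workflow as the command.

```shell
# Hypothetical wrapper (not part of viridian_workflow): run the tool inside Docker.
# $1 is a host directory that is mounted at /data (an arbitrary mount point chosen
# for this example); all remaining arguments are passed to viridian_workflow.
# Variables are global because POSIX sh has no "local".
viridian_docker() {
    : "${IMAGE:?set IMAGE to the name of the image you pulled or built}"
    host_dir=$1
    shift
    # Rebuild the positional parameters as the full docker command line.
    set -- docker run --rm -v "$host_dir:/data" -w /data "$IMAGE" viridian_workflow "$@"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"    # dry run: print the command instead of running it
    else
        "$@"
    fi
}

# Example (paths are relative to the mounted directory):
# IMAGE=<image name> viridian_docker "$PWD" run_one_sample --tech ont \
#     --ref_fasta data/MN908947.fasta --reads reads.fastq.gz --outdir OUT
```

Setting DRY_RUN=1 prints the docker command without running it, which is a cheap way to check the mount and arguments first.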
Releases include a Singularity image to download.
Each release has a Singularity image file called viridian_workflow_vX.Y.Z.img, where X.Y.Z is the release version.
To build a singularity container, clone this repository and then from its root run:
singularity build viridian_workflow.img Singularity.def
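Once an image file is available, whether downloaded or built, the script can be run through singularity exec. The sketch below is illustrative only: the two function names are hypothetical, and the version number in the usage comment is a placeholder, not a real release.

```shell
# Hypothetical helper: file name of the release image for a given version,
# following the viridian_workflow_vX.Y.Z.img naming pattern.
singularity_image_name() {
    echo "viridian_workflow_v$1.img"
}

# Hypothetical wrapper: run viridian_workflow inside a Singularity image.
# $1 is the image file; all remaining arguments are passed to viridian_workflow.
viridian_singularity() {
    image=$1
    shift
    singularity exec "$image" viridian_workflow "$@"
}

# Example, with 1.2.3 as a placeholder version:
# viridian_singularity "$(singularity_image_name 1.2.3)" run_one_sample --tech ont \
#     --ref_fasta data/MN908947.fasta --reads reads.fastq.gz --outdir OUT
```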
The examples below will run the default pipeline, using the built-in SARS-CoV-2 amplicon schemes ARTIC V3, ARTIC V4, and Midnight-1200. The pipeline automatically detects the scheme that best matches the input reads. To use your own amplicon scheme and/or force the choice of scheme, please read the amplicon schemes page. For a more detailed description of the pipeline options, please read the workflow usage page.
To run on paired Illumina reads:
viridian_workflow run_one_sample \
--tech illumina \
--ref_fasta data/MN908947.fasta \
--reads1 reads_1.fastq.gz \
--reads2 reads_2.fastq.gz \
--outdir OUT
To run on unpaired nanopore reads:
viridian_workflow run_one_sample \
--tech ont \
--ref_fasta data/MN908947.fasta \
--reads reads.fastq.gz \
--outdir OUT
The reference FASTA file used in those commands can be found in the viridian_workflow/amplicon_scheme_data/ directory of this repository. Adjust the --ref_fasta path to match where the file is on your system.
Other options:
- --sample_name MY_NAME: use this to change the sample name (default is "sample") that is put in the final FASTA file, BAM file, and VCF file.
- --keep_bam: use this option to keep the BAM file of original input reads mapped to the reference genome.
- --force: use with caution, as it will overwrite the output directory if it already exists.
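These options are useful when processing several samples in one go. The following is a minimal sketch for paired Illumina reads, assuming the files are named <sample>_1.fastq.gz and <sample>_2.fastq.gz. That naming convention, the function names, the RUNNER variable, and the OUT.<sample> output directory names are illustrative assumptions, not something the pipeline requires.

```shell
# Derive the sample name from a reads file named <sample>_1.fastq.gz
# (assumed naming convention).
sample_from_reads1() {
    basename "$1" _1.fastq.gz
}

# Run one viridian_workflow job per read pair found in a directory.
# $1 is the directory of FASTQ files, $2 is the reference FASTA.
# Each sample gets its own output directory and --sample_name.
run_paired_dir() {
    reads_dir=$1
    ref=$2
    for r1 in "$reads_dir"/*_1.fastq.gz; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        sample=$(sample_from_reads1 "$r1")
        r2="$reads_dir/${sample}_2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "skipping $sample: mate file $r2 not found" >&2
            continue
        fi
        # RUNNER is deliberately unquoted so that RUNNER=echo gives a dry run.
        ${RUNNER:-viridian_workflow} run_one_sample \
            --tech illumina \
            --ref_fasta "$ref" \
            --reads1 "$r1" \
            --reads2 "$r2" \
            --sample_name "$sample" \
            --outdir "OUT.$sample"
    done
}

# Example:
# run_paired_dir reads data/MN908947.fasta
```

Running with RUNNER=echo prints each command instead of executing it, which helps confirm the pairing before starting a long batch.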
The default files in the output directory are:
- consensus.fa: a FASTA file of the consensus sequence.
- variants.vcf: a VCF file of the identified variants between the consensus sequence and the reference genome.
- log.json: contains logging information for the viridian workflow run. This is described in detail in the JSON output file page.
If the option --keep_bam is used, then a sorted BAM file of the reads mapped to the reference will also be present, called reference_mapped.bam (and its index file reference_mapped.bam.bai).
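A quick sanity check after a run is to confirm that the default output files exist and to count the records in the VCF. The sketch below uses only standard shell tools. The function names are hypothetical, and the record count simply treats every line not starting with # as a variant, so a stray blank line would also be counted.

```shell
# Check that the three default output files exist and are non-empty.
# Returns non-zero and reports on stderr if any is missing.
check_outdir() {
    outdir=$1
    status=0
    for f in consensus.fa variants.vcf log.json; do
        if [ ! -s "$outdir/$f" ]; then
            echo "missing or empty: $outdir/$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Count VCF records: all lines that are not header lines (header lines start with '#').
# grep exits non-zero when the count is 0, hence the "|| true".
count_vcf_records() {
    grep -vc '^#' "$1" || true
}

# Example:
# check_outdir OUT && echo "variants: $(count_vcf_records OUT/variants.vcf)"
```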