-
Notifications
You must be signed in to change notification settings - Fork 2
Home
Pipeline to generate consensus.fasta files and identify pangolin lineage and nextstrain clade of Sars-CoV-2 genomes from ONT sequencing.
This pipeline takes as input a folder with name <run_name> which contains the folders fast5_pass
and fastq_pass
from Sars-CoV-2 ONT sequencing together with a CSV-file which links barcode and sample name, and it outputs consensus.fasta files along with <run_name>_report.csv
which includes pangolin lineage, nextstrain clade, mutations and QC.
python ~/Programs/git/covid-genomics/SARSCoV2_pipeline.py --input_dir /path/to/<run_name> --sample_names sample_names.csv
Where:
-
--input_dir
: Input directory<run_name>
must containfast5_pass
andfastq_pass
folders, with the<run_name>
corresponding to your run (e.g. 20210213_1359_X5_FAO88697_5cf6e6f0) -
--sample_names
: A CSV-file which connects barcodes with sample names, following the format:
barcode,sample_name
NB01,NEGCONTROL
NB02,E1234567_P1
NB02,V2345678_P1
Note: Basecalling and demultiplexing may also be performed if not already done on GridION/MinIT.
- What does the pipeline do?
- How to run
- Installation
- QC parameters and how to change these
- What does the output look like
- How to run the commands manually
Please note that this script is intended for the use at the AMR lab at Stavanger University Hospital, you may need to change it (sprecifically the scripts/config.cfg
file) for it to work in your environment.
This pipeline uses scripts and pipelines from the [Artic network's nCoV-2019 novel coronavirus bioinformatics protocol] (https://artic.network/ncov-2019/ncov2019-bioinformatics-sop.html). See also the Parameters page.
Many thanks to the artic, pangolin and nextclade developers for creating the protocols and pipelines!