v.0.0.2 - Tuan Nguyen
git clone https://github.com/tuannguyen8390/nf-EXPLOR.git
The pipeline deploys multiple bioinformatics tools for the detection of Single Nucleotide Polymorphisms (SNPs) & Structural Variants (SVs). The pipeline (version 0.0.3) is currently freely available and was designed to handle data from both Oxford Nanopore (ONT) and PacBio (however, it has only been tested with ONT so far). It is written in Nextflow DSL2.
Installation guide for Docker can be found here
Installation guide for Shifter can be found here
Installation guide for Singularity can be found here
Nextflow should run on whatever system you install it on (PBS, SLURM, AWS, Google Cloud...). All you need to do is open the nextflow.config file & edit a few settings to match your own configuration (marked as "BASED PARAMETERS"). These include the analysis directory, the executor used, and the computational resources.
🚩 I suggest backing up the original nextflow.config so you have a reference later on.
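A minimal sketch of that backup step, assuming you are in the repository root where nextflow.config lives. The helper name backup_config and the .orig suffix are illustrative choices, not part of the pipeline.

```shell
# Hypothetical helper: copy a config file to <file>.orig once, and never
# overwrite an existing backup on later runs.
backup_config() {
    cfg="${1:-nextflow.config}"
    if [ ! -f "$cfg" ]; then
        echo "config not found: $cfg" >&2
        return 1
    fi
    if [ -e "$cfg.orig" ]; then
        echo "backup already exists: $cfg.orig"
        return 0
    fi
    cp -p "$cfg" "$cfg.orig" && echo "backed up $cfg to $cfg.orig"
}

# Usage (run before editing the file):
#   backup_config nextflow.config
```

After editing, running diff nextflow.config.orig nextflow.config shows exactly what was changed.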
3. Pull assets (genome - ARS2.0; we suggest using this genome for reproducibility across partners of the consortium), then perform some initial setup
Run the following command to pull the assets (genome) and perform the initial setup (choose only one of Shifter/Docker/Singularity):
nextflow run setup.nf -profile shifter/docker/singularity
Edit the nextflow.config files to suit your local environment
nextflow run setup.nf -profile shifter/docker/singularity,test
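In the commands above, shifter/docker/singularity is shorthand for "pick one", not a literal value to type. Nextflow accepts several profiles as a comma-separated list, which is how test is appended in the second command. The sketch below only assembles and prints the command lines; build_profile is a hypothetical helper, not something shipped with the pipeline.

```shell
# Hypothetical helper: validate the container engine and build the value
# passed to -profile. An optional second argument (e.g. test) is appended
# after a comma.
build_profile() {
    engine="$1"
    extra="${2:-}"
    case "$engine" in
        shifter|docker|singularity) ;;
        *)
            echo "choose one of: shifter, docker, singularity" >&2
            return 1
            ;;
    esac
    if [ -n "$extra" ]; then
        printf '%s,%s\n' "$engine" "$extra"
    else
        printf '%s\n' "$engine"
    fi
}

# Print the two setup commands for a Singularity-based system
echo "nextflow run setup.nf -profile $(build_profile singularity)"
echo "nextflow run setup.nf -profile $(build_profile singularity test)"
```

The same pattern applies to main.nf, where a second profile such as awsbatch can follow the container engine after a comma.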
5. 🚀 Run the pipeline. The pipeline uses 2 metadata spreadsheets in the meta folder:
🚩 metadata_SR.csv : metadata for short-read data
🚩 metadata_LR.csv : metadata for long-read data
nextflow run main.nf -profile shifter/docker/singularity
nextflow run main.nf -profile shifter/docker/singularity,awsbatch
- FiltLong : QC for both LongReads and ShortReads (DEFAULT + RECOMMENDED)
- NanoFilt + FMLRC2 : NanoFilt for QC of Long-Read samples, and FMLRC2 + NanoFilt for QC of Short-Read samples (Currently NOT COMPATIBLE with PEPPER & DEEPVARIANT - use with caution!!!)
- Minimap2 : (DEFAULT for BovLRC participants)
3. SNP Caller: All callers can be run in parallel & are deployed per chromosome (Chr 1 - 29, X & Y, as the pipeline is currently deployed in cattle)
- Clair3 : (DEFAULT for BovLRC participants) - Please note that extra ONT models can be found on Clair3_rerio_models
- PEPPER - By default, Flowcell < 10.4 will be analyzed with PEPPER
- DEEPVARIANT - By default, Flowcell >= 10.4 will be analyzed with DEEPVARIANT & HIFI
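The per-chromosome deployment and the default PEPPER/DEEPVARIANT split described above can be sketched as follows. This is an illustration of the stated rule, not the pipeline's actual code: the function names, the flowcell version strings (such as 9.4.1 or 10.4.1), and the plain chromosome labels are assumptions, and the HIFI case is left out.

```shell
# List the cattle chromosomes the callers are run on: 1-29, then X and Y.
# Assumption: plain labels; the reference genome may name contigs differently.
chromosomes() {
    chr=1
    while [ "$chr" -le 29 ]; do
        echo "$chr"
        chr=$((chr + 1))
    done
    echo X
    echo Y
}

# Pick the default caller from a flowcell version (assumed format: 9.4.1,
# 10.4.1, optionally prefixed with R). Below 10.4 -> PEPPER, otherwise
# DEEPVARIANT.
default_caller() {
    version="${1#R}"
    major="${version%%.*}"
    case "$version" in
        *.*) rest="${version#*.}"; minor="${rest%%.*}" ;;
        *)   minor=0 ;;
    esac
    if [ "$major" -gt 10 ] || { [ "$major" -eq 10 ] && [ "$minor" -ge 4 ]; }; then
        echo DEEPVARIANT
    else
        echo PEPPER
    fi
}

echo "chromosomes: $(chromosomes | tr '\n' ' ')"
echo "9.4.1  -> $(default_caller 9.4.1)"
echo "10.4.1 -> $(default_caller 10.4.1)"
```

Comparing major and minor version numbers as integers avoids the pitfalls of comparing version strings lexically, where "9.4" would sort after "10.4".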
- PRE/POST QC : NanoPlot
- Alignment Depth : Mosdepth
- MultiQC
I have absolutely no doubt that there will be some problems :). It runs on my end, but perhaps not on yours. If that is the case, please email Tuan Nguyen ⭐