Ferlab-Ste-Justine/Post-processing-Pipeline is a bioinformatics pipeline designed for family-based analysis of GVCFs from multiple samples. It performs joint genotyping, tags low-quality variants, and optionally annotates the final VCF data using VEP and/or prioritizes variants using Exomiser.
- Standardize input vcf files using bcftools view
- Remove MNPs using bcftools
- Normalize .gvcf
- Combine .gvcf
- Joint-genotyping
- Tag false positive variants using one of:
  - For whole genome sequencing data: variant quality score recalibration (VQSR)
  - For whole exome sequencing data: hard filtering
- Optionally annotate variants with Variant Effect Predictor (VEP)
- Optionally integrate phenotype data to annotate, filter and prioritise variants likely to be disease-causing with Exomiser
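To make the order of the middle steps concrete, here is a minimal dry-run sketch covering MNP removal through joint genotyping. It only prints the commands, so it runs without bcftools or GATK installed. The sample names, file names, reference path, and the exact options are assumptions for illustration; the commands and options the pipeline actually uses may differ.

```bash
set -eu

# Dry-run helper: prints each command instead of executing it.
# Replace the body with "$@" to actually run the tools.
run() { printf '+ %s\n' "$*"; }

REF=reference.fasta   # hypothetical reference genome path
variant_args=""

# Hypothetical family members, each with a <name>.g.vcf.gz input.
for s in proband mother father; do
  # Remove MNPs, then normalize each gVCF against the reference.
  run bcftools view --exclude-types mnps -O z -o "${s}.nomnp.g.vcf.gz" "${s}.g.vcf.gz"
  run bcftools norm -f "$REF" -O z -o "${s}.norm.g.vcf.gz" "${s}.nomnp.g.vcf.gz"
  variant_args="${variant_args} --variant ${s}.norm.g.vcf.gz"
done

# Combine the per-sample gVCFs, then joint-genotype the family.
# $variant_args is left unquoted on purpose so it splits into separate arguments.
run gatk CombineGVCFs -R "$REF" $variant_args -O family.combined.g.vcf.gz
run gatk GenotypeGVCFs -R "$REF" -V family.combined.g.vcf.gz -O family.genotyped.vcf.gz
```

Variant tagging (VQSR or hard filtering) and the optional VEP and Exomiser steps would follow on the joint-genotyped VCF; they are left out of the sketch because their settings depend on the sequencing type and reference data.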
The full Ferlab workflow is shown in the image below, including the steps applicable prior to this pipeline. The steps relevant to the Ferlab-Ste-Justine/Post-processing-Pipeline correspond to the post-processing block.
This diagram was created with Inkscape, following the good practices recommended by the nf-core community. See nf-core Graphic Design.
Here is an example Nextflow command to run the pipeline:
nextflow run -c cluster.config Ferlab-Ste-Justine/Post-processing-Pipeline -r "v2.4.1" \
-params-file params.json \
--input samplesheet.csv \
--outdir results/dir \
--tools vep,exomiser
Note
If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow.
Warning
Please provide pipeline parameters via the CLI or the Nextflow -params-file option. Custom config files, including those provided by the Nextflow -c option, can be used to provide any configuration except for parameters; see docs.
For more details, see docs/usage.md and docs/reference_data.md.
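As an illustration of supplying parameters through a params file rather than a config file, the sketch below writes a minimal params.json. The keys mirror the --input, --outdir and --tools options from the example command above; the values are placeholders, and any other parameters your run needs are not shown here.

```bash
set -eu

# Write a minimal params file; each key corresponds to a --<name> CLI option.
cat > params.json <<'EOF'
{
  "input": "samplesheet.csv",
  "outdir": "results/dir",
  "tools": "vep,exomiser"
}
EOF

# Print (not run) the matching launch command. Cluster-specific settings stay
# in cluster.config, while pipeline parameters come from params.json.
printf '%s\n' 'nextflow run -c cluster.config Ferlab-Ste-Justine/Post-processing-Pipeline -r "v2.4.1" -params-file params.json'
```

Keeping parameters in params.json and executor or resource settings in the -c config file matches the split described in the warning.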
The -stub (or -stub-run) option can be added to run the "stub" block of processes instead of the "script" block. This can be helpful for testing.
To test your setup in stub mode, simply run nextflow run Ferlab-Ste-Justine/Post-processing-Pipeline -profile test,docker -stub.
For tests with real data, see the documentation in the test configuration profile.
The path to the output directory must be specified via the outdir parameter.
See docs/output.md for more details about pipeline outputs.
Ferlab-Ste-Justine/Post-processing-Pipeline was originally written by Damien Geneste, David Morais, Felix-Antoine Le Sieur, Jeremy Costanza, Lysiane Bouchard.
If you would like to contribute to this pipeline, please see the contributing guidelines.
The documentation for the various tools used in this workflow is available here:
GATK:
This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.