Commit

Merge pull request #52 from Ferlab-Ste-Justine/fix/CLIN-3706-support-more-file-extensions-for-raw-gvcf

fix: CLIN-3706 support more input gvcf file extensions
LysianeBouchard authored Dec 18, 2024
2 parents 309507f + a6e9c27 commit ea8a954
Showing 19 changed files with 966 additions and 103 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -18,6 +18,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### `Fixed`
 - [#50](https://github.com/Ferlab-Ste-Justine/Post-processing-Pipeline/pull/50) Use container tag 1.20 for splitMultiAllelics process
 - [#51](https://github.com/Ferlab-Ste-Justine/Post-processing-Pipeline/pull/51) Add missing ressources for exomiser process in configuration
+- [#52](https://github.com/Ferlab-Ste-Justine/Post-processing-Pipeline/pull/52) Ensure .gvcf file extensions are supported in all scenarios

 ### `Known issues`
 - The nf-core modules that we are using have a potential performance flaw. Typically, the regex used to describe the output files also match the input files (ex: "*.vcf"), which can cause unnecessary file transfers. This has already proven to cause issues on fusion. One fix could be to transfer the whole modules to local to perform the small change necessary to fix this.
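
Reviewer note: the known issue above can be illustrated with a small sketch. It is not taken from the pipeline. The file names and the narrower pattern are hypothetical, and Python's `fnmatch` only approximates how Nextflow resolves output globs. It shows how a pattern such as `*.vcf` would capture a staged input as well as the file the task actually produced.

```python
from fnmatch import fnmatchcase

# Hypothetical contents of a task work directory: one staged input
# and one file written by the tool.
staged_input = "sample1.vcf"
produced_output = "sample1.filtered.vcf"
work_dir = [staged_input, produced_output]


def capture(pattern: str, names: list[str]) -> list[str]:
    """Return the names that an output glob would pick up."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Broad pattern of the kind mentioned in the known issue:
# it matches the input too, so the input would be treated as an output.
captured_broad = capture("*.vcf", work_dir)

# A narrower, hypothetical pattern that only matches what the task wrote.
captured_narrow = capture("*.filtered.vcf", work_dir)

print(captured_broad)
print(captured_narrow)
```

A narrower output pattern, or a distinct output prefix, is one way a local copy of a module could avoid re-transferring inputs; whether that is the right fix for this pipeline is left open by the changelog.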
15 changes: 8 additions & 7 deletions README.md
@@ -14,15 +14,16 @@
 It performs joint genotyping, tags low-quality variants, and optionally annotates the final vcf data using vep and/or prioritize variant using exomiser.

 ### Summary:
-1. Remove MNPs using bcftools
-2. Normalize .gvcf
-3. Combine .gvcf
-4. [Joint-genotyping](https://gatk.broadinstitute.org/hc/en-us/articles/360037057852-GenotypeGVCFs)
-5. Tag false positive variants with either:
+1. Standardize input vcf files using bcftools view
+2. Remove MNPs using bcftools
+3. Normalize .gvcf
+4. Combine .gvcf
+5. [Joint-genotyping](https://gatk.broadinstitute.org/hc/en-us/articles/360037057852-GenotypeGVCFs)
+6. Tag false positive variants with either:
 - For whole genome sequencing data: [Variant quality score recalibration (VQSR)](https://gatk.broadinstitute.org/hc/en-us/articles/360036510892-VariantRecalibrator)
 - For whole exome sequencing data: [Hard-Filtering](https://gatk.broadinstitute.org/hc/en-us/articles/360036733451-VariantFiltration)
-6. Optionnally annotate variants with [Variant effect predictor (VEP)](https://useast.ensembl.org/info/docs/tools/vep/index.html)
-7. Optionnally integrate phenotype data to annotate, filter and prioritise variants likely to be disease-causing with [exomiser](https://www.sanger.ac.uk/tool/exomiser/)
+7. Optionnally annotate variants with [Variant effect predictor (VEP)](https://useast.ensembl.org/info/docs/tools/vep/index.html)
+8. Optionnally integrate phenotype data to annotate, filter and prioritise variants likely to be disease-causing with [exomiser](https://www.sanger.ac.uk/tool/exomiser/)



6 changes: 6 additions & 0 deletions conf/modules.config
@@ -26,6 +26,12 @@ process {

 publishDir = new_publish_dir()

+withName: BCFTOOLS_VIEW {
+    container = 'staphb/bcftools:1.20'
+    ext.args = { '-Oz --write-index=tbi' }
+    ext.prefix = {meta.id + ".standardized.g"}
+}
+
 withName: BCFTOOLS_FILTER {
     container = 'staphb/bcftools:1.20'
     ext.args = {'-e \'strlen(REF)>1 & strlen(REF)==strlen(ALT) & TYPE="snp"\' -Oz --write-index=tbi'}
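
Reviewer note: as a rough illustration of the `BCFTOOLS_VIEW` settings above, the sketch below derives the file names the standardization step would be expected to produce. The prefix rule comes from `ext.prefix` in this hunk. The mapping from output-type flag to extension is an assumption about how the nf-core `bcftools/view` module names its output (its `main.nf` is not rendered in this diff), and the function names are hypothetical.

```python
from typing import Optional, Tuple


def output_extension(args: str) -> str:
    """Assumed mapping from the bcftools output-type flag to a file extension."""
    flag_to_ext = [("-Ob", "bcf.gz"), ("-Ou", "bcf"), ("-Oz", "vcf.gz"), ("-Ov", "vcf")]
    for flag, ext in flag_to_ext:
        if flag in args:
            return ext
    return "vcf"  # assumed default when no output type is given


def standardized_outputs(sample_id: str, args: str) -> Tuple[str, Optional[str]]:
    """Build the expected output and index names for one sample."""
    prefix = f"{sample_id}.standardized.g"  # mirrors ext.prefix in the config
    vcf_name = f"{prefix}.{output_extension(args)}"
    # With --write-index=tbi, bcftools writes a tabix index next to the output.
    index_name = f"{vcf_name}.tbi" if "--write-index=tbi" in args else None
    return vcf_name, index_name


vcf_name, index_name = standardized_outputs("sample1", "-Oz --write-index=tbi")
print(vcf_name, index_name)
```

Under these assumptions every input ends up bgzip-compressed, indexed, and named `<id>.standardized.g.vcf.gz` regardless of its original extension, which would fit the removal of the bgzip and index caveat from `docs/usage.md` in this commit.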
Binary file modified docs/images/ferlab_workflow.png
180 changes: 102 additions & 78 deletions docs/images/ferlab_workflow.svg
4 changes: 0 additions & 4 deletions docs/usage.md
@@ -85,10 +85,6 @@ You can optionally skip this step by setting the `exclude_mnps` parameter to `fa

 Note that MNPs are not supported by the VQSR procedure, so you cannot skip this step if you have whole genome data.

-Additionally, if you skip the exclusion of MNPs, ensure that your input GVCF files are indexed or that they are compressed with bgzip.
-If the index file is missing, the workflow will attempt to generate it, but the input GVCF file must be compressed with bgzip for this to work.
-
-
 ### Tools

 You can include additional analysis in your pipeline via the `tools` parameter. Currently, the pipeline supports
5 changes: 5 additions & 0 deletions modules.json
@@ -15,6 +15,11 @@
 "git_sha": "33ef773a7ea36e88323902f63662aa53c9b88988",
 "installed_by": ["modules"]
 },
+"bcftools/view": {
+    "branch": "master",
+    "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
+    "installed_by": ["modules"]
+},
 "ensemblvep/vep": {
 "branch": "master",
 "git_sha": "6e3585d9ad20b41adc7d271009f8cb5e191ecab4",
5 changes: 5 additions & 0 deletions modules/nf-core/bcftools/view/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

76 changes: 76 additions & 0 deletions modules/nf-core/bcftools/view/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

88 changes: 88 additions & 0 deletions modules/nf-core/bcftools/view/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
