updated tools docu
marc-sturm committed Jul 3, 2023
1 parent 2a1d247 commit 1d9da0c
Showing 8 changed files with 104 additions and 40 deletions.
7 changes: 5 additions & 2 deletions doc/tools/FastqConcat.md
@@ -1,5 +1,5 @@
### FastqConcat tool help
FastqConcat (2020_03-159-g5c8b2e82)
FastqConcat (2023_03-107-g2a1d2478)

Concatenates several FASTQ files into one output FASTQ file.

@@ -10,6 +10,8 @@
Optional parameters:
-compression_level <int> Output FASTQ compression level from 1 (fastest) to 9 (best compression).
Default value: '1'
-long_read Support long reads (> 1kb).
Default value: 'false'

Special parameters:
--help Shows this help and exits.
@@ -18,8 +20,9 @@
--tdx Writes a Tool Definition Xml file. The file name is the application name with the suffix '.tdx'.

### FastqConcat changelog
FastqConcat 2020_03-159-g5c8b2e82
FastqConcat 2023_03-107-g2a1d2478

2023-06-15 Added support for long reads.
2020-07-15 Added 'compression_level' parameter.
2019-04-08 Initial version of this tool
[back to ngs-bits](https://github.com/imgag/ngs-bits)
7 changes: 5 additions & 2 deletions doc/tools/FastqTrim.md
@@ -1,5 +1,5 @@
### FastqTrim tool help
FastqTrim (2020_03-159-g5c8b2e82)
FastqTrim (2023_03-107-g2a1d2478)

Trims start/end bases from all reads in a FASTQ file.

@@ -16,6 +16,8 @@
Default value: '0'
-compression_level <int> Output FASTQ compression level from 1 (fastest) to 9 (best compression).
Default value: '1'
-long_read Support long reads (> 1kb).
Default value: 'false'

Special parameters:
--help Shows this help and exits.
@@ -24,8 +26,9 @@
--tdx Writes a Tool Definition Xml file. The file name is the application name with the suffix '.tdx'.

### FastqTrim changelog
FastqTrim 2020_03-159-g5c8b2e82
FastqTrim 2023_03-107-g2a1d2478

2023-06-15 Added support for long reads.
2020-07-15 Added 'compression_level' parameter.
2016-08-26 Added 'len' parameter.
[back to ngs-bits](https://github.com/imgag/ngs-bits)
36 changes: 20 additions & 16 deletions doc/tools/NGSDExportAnnotationData.md
@@ -1,28 +1,30 @@
### NGSDExportAnnotationData tool help
NGSDExportAnnotationData (2022_12-82-g025eb99e)
NGSDExportAnnotationData (2023_03-107-g2a1d2478)

Generates a VCF file with all variants and annotations from the NGSD and a BED file containing the gene information of the NGSD.

Mandatory parameters:
-variants <file> Output variant list as VCF.
Exports information about germline variants, somatic variants and genes from NGSD for use as an annotation source, e.g. in megSAP.

Optional parameters:
-genes <file> Optional BED file containing the genes and the gene info (only germline).
-germline <file> Export germline variants (VCF format).
Default value: ''
-somatic <file> Export somatic variants (VCF format).
Default value: ''
-genes <file> Exports BED file containing genes and gene information.
Default value: ''
-reference <file> Reference genome FASTA file. If unset 'reference_genome' from the 'settings.ini' file is used.
Default value: ''
-test Uses the test database instead of the production database.
Default value: 'false'
-max_af <float> Maximum allele frequency of exported variants (default: 0.05).
-max_af <float> Maximum allele frequency of exported variants (germline).
Default value: '0.05'
-gene_offset <int> Defines the number of bases by which the region of each gene is extended.
-gene_offset <int> Defines the number of bases by which the regions of genes are extended (genes).
Default value: '5000'
-mode <enum> Determines the database which is exported.
Default value: 'germline'
Valid: 'germline,somatic'
-vicc_config_details Includes details about VICC interpretation. Works only in somatic mode.
-vicc_config_details Includes details about VICC interpretation (somatic).
Default value: 'false'
-threads <int> Number of threads to use.
Default value: '5'
-verbose Enables verbose debug output.
Default value: 'false'
-debug Enables debug output (germline only).
-max_vcf_lines <int> Maximum number of VCF lines to write per chromosome - for debugging.
Default value: '-1'
-test Uses the test database instead of the production database.
Default value: 'false'

Special parameters:
@@ -32,8 +34,10 @@
--tdx Writes a Tool Definition Xml file. The file name is the application name with the suffix '.tdx'.

### NGSDExportAnnotationData changelog
NGSDExportAnnotationData 2022_12-82-g025eb99e
NGSDExportAnnotationData 2023_03-107-g2a1d2478

2023-06-18 Refactoring of command line parameters and parallelization of somatic export.
2023-06-16 Added support for 'germline_mosaic' column in 'variant' table and added parallelization.
2021-07-19 Code and parameter refactoring.
2021-07-19 Added support for 'germline_het' and 'germline_hom' columns in 'variant' table.
2019-12-06 Comments are now URL encoded.
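
After this change the three exports (`-germline`, `-somatic`, `-genes`) are independent optional outputs rather than a single `-mode` switch. The following is a minimal sketch of how a wrapper script might assemble the argument list. The helper name `build_export_command` and the file names are hypothetical, and two behaviours are assumptions not confirmed by the help text: that an unset (empty) output means the export is skipped, and that boolean parameters such as `-test` are passed as bare flags.

```python
from typing import List, Optional


def build_export_command(
    germline: Optional[str] = None,
    somatic: Optional[str] = None,
    genes: Optional[str] = None,
    max_af: float = 0.05,
    gene_offset: int = 5000,
    test: bool = False,
) -> List[str]:
    """Build an argument list for NGSDExportAnnotationData (hypothetical helper)."""
    cmd = ["NGSDExportAnnotationData"]
    # Assumption: an output that is left unset is simply not exported.
    for flag, path in (("-germline", germline), ("-somatic", somatic), ("-genes", genes)):
        if path:
            cmd += [flag, path]
    # The help text marks -max_af as germline-specific and -gene_offset as gene-specific,
    # so they are only added when the matching export is requested.
    if germline:
        cmd += ["-max_af", str(max_af)]
    if genes:
        cmd += ["-gene_offset", str(gene_offset)]
    # Assumption: boolean parameters are passed as bare flags.
    if test:
        cmd.append("-test")
    return cmd


if __name__ == "__main__":
    print(build_export_command(germline="germline.vcf", genes="genes.bed", test=True))
```

The resulting list could be handed to `subprocess.run` once the tool is on the `PATH`; the sketch only builds the arguments and does not execute anything.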
22 changes: 22 additions & 0 deletions doc/tools/NGSDExportGff.md
@@ -0,0 +1,22 @@
### NGSDExportGff tool help
NGSDExportGff (2023_03-107-g2a1d2478)

Writes all transcripts and exons of all genes to a gff3 file.

Mandatory parameters:
-out <file> The output file directory.

Optional parameters:
-test Uses the test database instead of the production database.
Default value: 'false'

Special parameters:
--help Shows this help and exits.
--version Prints version and exits.
--changelog Prints changelog and exits.
--tdx Writes a Tool Definition Xml file. The file name is the application name with the suffix '.tdx'.

### NGSDExportGff changelog
NGSDExportGff 2023_03-107-g2a1d2478

[back to ngs-bits](https://github.com/imgag/ngs-bits)
10 changes: 7 additions & 3 deletions doc/tools/NGSDExportSamples.md
@@ -1,5 +1,5 @@
### NGSDExportSamples tool help
NGSDExportSamples (2023_02-56-g0fe5818f)
NGSDExportSamples (2023_03-107-g2a1d2478)

Lists processed samples from the NGSD.

@@ -12,6 +12,8 @@
Default value: 'false'
-no_tumor If set, tumor samples are excluded.
Default value: 'false'
-no_normal If set, germline samples are excluded.
Default value: 'false'
-no_ffpe If set, FFPE samples are excluded.
Default value: 'false'
-match_external_names If set, also samples for which the external name matches 'sample' are exported.
@@ -50,7 +52,9 @@
Default value: 'false'
-run_device <string> Sequencing run device name filter.
Default value: ''
-run_before <string> Sequencing run start date before or equal to the given date.
-run_before <string> Sequencing run before or equal to the given date.
Default value: ''
-run_after <string> Sequencing run after or equal to the given date.
Default value: ''
-no_bad_runs If set, sequencing runs with 'bad' quality are excluded.
Default value: 'false'
@@ -79,7 +83,7 @@
--tdx Writes a Tool Definition Xml file. The file name is the application name with the suffix '.tdx'.

### NGSDExportSamples changelog
NGSDExportSamples 2023_02-56-g0fe5818f
NGSDExportSamples 2023_03-107-g2a1d2478

2022-11-11 Added 'ancestry' and 'phenotypes' filter options.
2022-03-03 Added 'disease_group', 'disease_status', 'project_type' and 'tissue' filter options.
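
Both `-run_before` and `-run_after` are described as inclusive ("before or equal", "after or equal"). A minimal sketch of that inclusive range check is shown here. The function name `run_in_range` is hypothetical, and the sketch assumes each run is represented by a single date; the help text does not say which run date the tool compares against.

```python
from datetime import date
from typing import Optional


def run_in_range(
    run_date: date,
    run_before: Optional[date] = None,
    run_after: Optional[date] = None,
) -> bool:
    """Return True if run_date satisfies the inclusive bounds (hypothetical helper)."""
    # "before or equal": the run date may not be later than run_before.
    if run_before is not None and run_date > run_before:
        return False
    # "after or equal": the run date may not be earlier than run_after.
    if run_after is not None and run_date < run_after:
        return False
    # An unset bound (empty default in the help text) is treated as no restriction.
    return True
```

Combining both bounds selects a closed date interval, for example all runs of a single quarter.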
21 changes: 11 additions & 10 deletions doc/tools/NGSDExportStudyGHGA.md
@@ -1,24 +1,25 @@
### NGSDExportStudyGHGA tool help
NGSDExportStudyGHGA (2022_12-82-g025eb99e)
NGSDExportStudyGHGA (2023_03-107-g2a1d2478)

Exports meta data of a study from NGSD to a JSON format for import into GHGA.

Mandatory parameters:
-data <file> JSON file with data that is not contained in NGSD.
-out <file> Output JSON file.
-samples <file> TSV file with pseudonym, SAP ID and processed sample ID
-data <file> JSON file with data that is not contained in NGSD.
-out <file> Output JSON file.

Optional parameters:
-test Test mode: uses the test NGSD, does not calculate size/checksum of BAMs, ...
Default value: 'false'
-test Test mode: uses the test NGSD, does not calculate size/checksum of BAMs, ...
Default value: 'false'

Special parameters:
--help Shows this help and exits.
--version Prints version and exits.
--changelog Prints changelog and exits.
--tdx Writes a Tool Definition Xml file. The file name is the application name with the suffix '.tdx'.
--help Shows this help and exits.
--version Prints version and exits.
--changelog Prints changelog and exits.
--tdx Writes a Tool Definition Xml file. The file name is the application name with the suffix '.tdx'.

### NGSDExportStudyGHGA changelog
NGSDExportStudyGHGA 2022_12-82-g025eb99e
NGSDExportStudyGHGA 2023_03-107-g2a1d2478

2023-01-31 Initial implementation (version 0.9.0 of schema).
[back to ngs-bits](https://github.com/imgag/ngs-bits)
25 changes: 25 additions & 0 deletions doc/tools/NGSDImportSampleQC.md
@@ -0,0 +1,25 @@
### NGSDImportSampleQC tool help
NGSDImportSampleQC (2023_03-107-g2a1d2478)

Imports QC metrics of a sample into NGSD.

Mandatory parameters:
-ps <string> Processed sample name.
-files <filelist> qcML files to import.

Optional parameters:
-force Overwrites already existing QC metrics instead of throwing an error.
Default value: 'false'
-test Uses the test database instead of the production database.
Default value: 'false'

Special parameters:
--help Shows this help and exits.
--version Prints version and exits.
--changelog Prints changelog and exits.
--tdx Writes a Tool Definition Xml file. The file name is the application name with the suffix '.tdx'.

### NGSDImportSampleQC changelog
NGSDImportSampleQC 2023_03-107-g2a1d2478

[back to ngs-bits](https://github.com/imgag/ngs-bits)
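
The `-force` parameter of NGSDImportSampleQC is described as overwriting existing QC metrics instead of throwing an error. The sketch below illustrates that contract with an in-memory dictionary standing in for the database. It is not the tool's implementation: the function `import_qc`, the dictionary layout, and the sample and metric names are all hypothetical.

```python
from typing import Dict


def import_qc(
    store: Dict[str, Dict[str, float]],
    ps: str,
    metrics: Dict[str, float],
    force: bool = False,
) -> None:
    """Store QC metrics for processed sample `ps` (illustrative stand-in for NGSD)."""
    # Without force, existing metrics are an error rather than being silently replaced.
    if ps in store and not force:
        raise ValueError(f"QC metrics already exist for '{ps}'; set force to overwrite")
    # With force (or for a new sample), the metrics are written.
    store[ps] = dict(metrics)


if __name__ == "__main__":
    db: Dict[str, Dict[str, float]] = {}
    import_qc(db, "SAMPLE_01", {"read count": 1000.0})
    import_qc(db, "SAMPLE_01", {"read count": 2000.0}, force=True)
    print(db)
```

Failing by default makes an accidental re-import visible, while `force` keeps a deliberate re-import to a single call.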
16 changes: 9 additions & 7 deletions doc/tools/VariantFilterAnnotations.md
@@ -1,5 +1,5 @@
### VariantFilterAnnotations tool help
VariantFilterAnnotations (2023_03-63-gec44de43)
VariantFilterAnnotations (2023_03-107-g2a1d2478)

Filter a variant list in GSvar format based on variant annotations.

@@ -41,7 +41,8 @@
Count NGSD Filter based on the hom/het occurrences of a variant in the NGSD.
Parameters:
max_count - Maximum NGSD count [default=20] [min=0]
ignore_genotype - If set, all NGSD entries are counted independent of the variant genotype. Otherwise, for homozygous variants only homozygous NGSD entries are counted and for heterozygous variants all NGSD entries are counted. [default=false]
ignore_genotype - If set, all variants in NGSD are counted independent of the genotype. Otherwise, for homozygous variants only homozygous NGSD variants are counted and for heterozygous variants homozygous and heterozygous NGSD variants are counted. [default=false]
mosaic_as_het - If set, mosaic variants are counted as heterozygous. Otherwise, they are not counted. [default=false]
Filter column empty Filter that preserves variants which have no entry in the 'filter' column.
Filter columns Filter based on the entries of the 'filter' column.
Parameters:
@@ -126,8 +127,9 @@
max_af_nor - Maximum allele frequency in normal sample [%] [default=1] [min=0.0] [max=100.0]
Splice effect Filter based on the predicted change in splice effect
Parameters:
MaxEntScan - Minimum percentage change in the value of MaxEntScan. Positive min. increase, negative min. decrease. Disabled if set to zero. [default=-15]
SpliceAi - Minimum SpliceAi value. Disabled if set to zero. [default=0.5] [min=0] [max=1]
SpliceAi - Minimum SpliceAi score. Disabled if set to zero. [default=0.5] [min=0] [max=1]
MaxEntScan - Minimum predicted splice effect. Disabled if set to LOW. [default=HIGH] [valid=HIGH,MODERATE,LOW]
splice_site_only - Use native splice site predictions only. Skip de-novo acceptor/donor predictions (MaxEntScan). [default=true]
action - Action to perform [default=FILTER] [valid=KEEP,FILTER]
Text search Filter for text match in variant annotations.
The text comparison ignores the case.
@@ -141,8 +143,8 @@
build - Genome build used for pseudoautosomal region coordinates [default=hg38] [valid=hg19,hg38]
Tumor zygosity Filter based on the zygosity of tumor-only samples. Filters out germline het/hom calls.
Parameters:
het_af_range - Consider allele frequencies of 50% ± het_af_range as heterozygous and thus as germline. [default=0] [min=0] [max=49.9]
hom_af_range - Consider allele frequencies of 100% ± hom_af_range as homozygous and thus as germline. [default=0] [min=0] [max=99.9]
het_af_range - Consider allele frequencies of 50% ± het_af_range as heterozygous and thus as germline. [default=0] [min=0] [max=49.9]
hom_af_range - Consider allele frequencies of 100% ± hom_af_range as homozygous and thus as germline. [default=0] [min=0] [max=99.9]
Variant quality Filter for variant quality
Parameters:
qual - Minimum variant quality score (Phred) [default=250] [min=0]
@@ -173,7 +175,7 @@
--tdx Writes a Tool Definition Xml file. The file name is the application name with the suffix '.tdx'.

### VariantFilterAnnotations changelog
VariantFilterAnnotations 2023_03-63-gec44de43
VariantFilterAnnotations 2023_03-107-g2a1d2478

2018-07-30 Replaced command-line parameters by INI file and added many new filters.
2017-06-14 Refactoring of genotype-based filters: now also supports multi-sample filtering of affected and control samples.
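
The reworded 'Count NGSD' filter parameters (`ignore_genotype`, `mosaic_as_het`) describe a small counting rule that is easier to follow as code. The sketch below is one reading of that description, not the tool's implementation. The function names and the split into hom/het/mosaic counts are hypothetical, and two points are assumptions: that mosaic entries are only ever counted when `mosaic_as_het` is set (also when `ignore_genotype` is set), and that a variant passes when its count is less than or equal to `max_count`.

```python
def ngsd_count(
    n_hom: int,
    n_het: int,
    n_mosaic: int,
    variant_is_hom: bool,
    ignore_genotype: bool = False,
    mosaic_as_het: bool = False,
) -> int:
    """Count NGSD entries for a variant under one reading of the filter description."""
    # Mosaic entries are treated as heterozygous only if requested; otherwise not counted.
    het_like = n_het + (n_mosaic if mosaic_as_het else 0)
    # Heterozygous variants, or any variant with ignore_genotype, count hom and het entries.
    if ignore_genotype or not variant_is_hom:
        return n_hom + het_like
    # Homozygous variants count homozygous NGSD entries only.
    return n_hom


def passes_count_filter(count: int, max_count: int = 20) -> bool:
    """Assumption: max_count is an inclusive upper bound."""
    return count <= max_count
```

Under this reading, a homozygous variant seen 3 times homozygous and 30 times heterozygous in NGSD passes the default filter, while the same variant called heterozygous does not.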
