Final README adjustments
mrbarbitoff committed Dec 14, 2022
1 parent d6c0dfd commit e7fa404
Showing 1 changed file with 11 additions and 5 deletions.
16 changes: 11 additions & 5 deletions README.md
@@ -8,7 +8,7 @@ conda env create -f requirements.yml
```
## Required input data
* VCF file of variants for further annotation
-* BED file with available uORFs (`sorted.v3.bed` in this repository)
+* BED file with available uORFs (`sorted.v4.bed` in this repository)
* GTF file with genomic features annotation
* \[optional\] TSV file with gene-level gnomAD constraint statistics

@@ -26,7 +26,8 @@ python uORF_annotator.py \
```
## Output formats specification
### tab-separated (tsv) file
-Each row represents annotation of a single variant in particular uORF (per uORF annotation). Fields in the file have the following content:
+
+Two TSV outputs are generated: one for ATG-started uORFs and one for non-ATG-started uORFs. Each row represents the annotation of a single variant in a particular uORF (per-uORF annotation). Fields in the file have the following content:

1) #CHROM - contig name
2) POS - position
@@ -52,16 +53,21 @@ Each row represents annotation of a single variant in particular uORF (per uORF
The generated VCF output contains all variants affecting uORF sequences. Each variant is annotated with the following INFO fields: `uORFs`, `uORFs_ATG`, `uORFs_eff`. The description of fields is given below:

* `uORFs` - a full consequence annotation for each variant-uORF combination. Format: 'ORF_START|ORF_END|ORF_SYMB|ORF_CONSEQ|main_cds_effect|in_known_CDS|in_known_ORF|utid|overlapping_type|dominance_type|codon_type'
-* `uORFs_ATG` - a flag indicating if a variant falls within ATG-starting uORF.
-* `uORFs_eff` - a short notation of main CDS effect. ext - N-terminal extension, overl - out-of-frame overlap, activ - overlap removal with possible main ORF activation, unaff - no effect on main CDS.
+* `uORFs_ATG` - a flag indicating whether a variant falls within at least one ATG-starting uORF.
+* `uORFs_eff` - a short notation of how the variant-induced change in uORF structure affects the main coding sequence (CDS) of a gene. ext - N-terminal extension, overl - out-of-frame overlap, activ - overlap removal with possible main ORF activation, unaff - no effect on main CDS. If the variant falls into more than one uORF, the effects are listed separated by `&`.
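
The INFO fields described above can be unpacked with plain string splitting. The following is a minimal sketch, not part of *uORF Annotator* itself: the function names and the sample values are hypothetical, and it assumes that a single `uORFs` annotation holds exactly the eleven `|`-separated subfields from the format string and that `uORFs_eff` values are joined with `&`.

```python
from typing import Dict, List

# Subfield order copied from the format string of the `uORFs` INFO field.
UORF_SUBFIELDS = [
    "ORF_START", "ORF_END", "ORF_SYMB", "ORF_CONSEQ", "main_cds_effect",
    "in_known_CDS", "in_known_ORF", "utid", "overlapping_type",
    "dominance_type", "codon_type",
]


def parse_uorf_annotation(value: str) -> Dict[str, str]:
    """Split one variant-uORF annotation into named subfields (hypothetical helper)."""
    parts = value.split("|")
    if len(parts) != len(UORF_SUBFIELDS):
        raise ValueError(
            f"expected {len(UORF_SUBFIELDS)} subfields, got {len(parts)}"
        )
    return dict(zip(UORF_SUBFIELDS, parts))


def parse_uorf_effects(value: str) -> List[str]:
    """Split a `uORFs_eff` value into one effect code per affected uORF."""
    return value.split("&")


if __name__ == "__main__":
    # Made-up sample values, used only to illustrate the layout.
    sample = "100|160|GENE1|stop_gained|activ|no|no|tx1|type1|dom1|ATG"
    record = parse_uorf_annotation(sample)
    print(record["ORF_SYMB"], record["main_cds_effect"])
    print(parse_uorf_effects("ext&unaff"))
```

How several annotations for one variant are delimited inside `uORFs` is not stated here, so the sketch handles a single annotation at a time.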

### BED format

-A BED file generated by the *uORF Annotator* contains all uORFs affected by variants that alter the uORF length. The BED file contains one entry for each affected uORF, and one entry for each variant-uORF combination that leads to changes in anticipated length of uORF product. Color legend:
+*uORF Annotator* generates two BED files with uORFs affected by variants that alter the uORF length (one file contains ATG-started uORFs and the other contains non-ATG-started uORFs). Both BED files contain two entries for each affected uORF:
+1) the initial uORF, shown in black; its `name` field format is uORF_unique_number-gene_name|uORF_type|start_codon_type (ATG/non-ATG);
+2) the resulting uORF after introduction of a variant, colored according to the effect; its `name` field format is uORF_unique_number-gene_name|variant|variant_type|main_CDS_effect.
+
+Color legend:
* Grey features - cases when the variant does not change the overlap between uORF and main CDS.
* Orange features - cases when (a) uORF-truncating variant eliminates the existing overlap between uORF and main CDS; or (b) variant leads to the production of a chimeric protein product of the gene, possessing an extension at the N-terminus resulting from uORF translation
* Red features - cases where variant leads to the appearance of a new overlapping segment between uORF and main gene CDS, with the two sequences translated in different frames.
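
As an illustration of the two `name` layouts and the color legend, here is a rough sketch of how a downstream script might tell the two entry kinds apart. Everything in it is an assumption rather than the tool's own code: the helper names are hypothetical, entries are distinguished only by the number of `|`-separated parts, the `main_CDS_effect` part is assumed to use the same short codes as `uORFs_eff`, and the effect-to-color mapping is inferred from the legend wording.

```python
from typing import Dict

# Inferred from the color legend: unaff -> grey, activ/ext -> orange, overl -> red.
# This mapping is an assumption, not taken from the tool's source code.
EFFECT_TO_COLOR = {
    "unaff": "grey",
    "activ": "orange",
    "ext": "orange",
    "overl": "red",
}


def parse_bed_name(name: str) -> Dict[str, str]:
    """Parse a BED `name` field into its parts (hypothetical helper)."""
    parts = name.split("|")
    # Split on the first hyphen only, so gene symbols containing '-' stay intact.
    uorf_id, _, gene = parts[0].partition("-")
    if len(parts) == 3:
        return {"kind": "initial", "uorf_id": uorf_id, "gene": gene,
                "uorf_type": parts[1], "start_codon_type": parts[2]}
    if len(parts) == 4:
        return {"kind": "resulting", "uorf_id": uorf_id, "gene": gene,
                "variant": parts[1], "variant_type": parts[2],
                "main_cds_effect": parts[3]}
    raise ValueError(f"unexpected name layout: {name!r}")


def expected_color(entry: Dict[str, str]) -> str:
    """Return the legend color expected for a parsed entry."""
    if entry["kind"] == "initial":
        return "black"
    return EFFECT_TO_COLOR.get(entry["main_cds_effect"], "unknown")


if __name__ == "__main__":
    # Made-up names, used only to illustrate the two layouts.
    for name in ("12-GENE1|type1|ATG", "12-GENE1|var1|stop_gained|overl"):
        entry = parse_bed_name(name)
        print(entry["kind"], expected_color(entry))
```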


## Supplementary data files

This repository contains two additional files: