seedExtension

This repository contains the code for my bachelor's thesis, "Extension of Seeds for Multiple Alignments in Genome Graphs".

Supervisor: Prof. Dr. Mario Stanke, the original developer of the gene prediction tool AUGUSTUS.

Abstract

The number of sequenced genomes is increasing rapidly, but annotating these genomes is more elaborate and therefore slower. To narrow this gap, new robust methods of genome annotation must be developed. The Geometric Hashing Algorithm (GH), extended in this work by the Seed Extension Algorithm (SE), addresses this task. The seeds created by GH need to be filtered, and SE does this with a gapless seed extension. Gapless extension is straightforward on linearly stored sequences. This tool, however, uses a colored de Bruijn graph, which can have storage advantages, so SE transfers the concept of gapless seed extension to the graph. The filtered seeds are then intended to reliably identify coding sequences and thus support the annotation process.
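
To illustrate the linear case mentioned in the abstract, here is a minimal Python sketch of a gapless seed extension between two plain strings. It is not the code from this repository and does not operate on a de Bruijn graph. The function name `extend_gapless`, the match/mismatch scores and the X-drop stopping rule are all assumptions chosen for illustration.

```python
def extend_gapless(a, b, i, j, k, match=1, mismatch=-1, x_drop=3):
    """Extend the seed a[i:i+k] / b[j:j+k] in both directions without gaps.

    Hypothetical helper for illustration only. Extension in one direction
    stops at a sequence end or once the running score falls x_drop below
    the best score seen so far; the extension is trimmed to that best point.
    Returns (start_in_a, start_in_b, length, score).
    """

    def walk(pa, pb, step):
        # Walk away from the seed, comparing one position pair at a time.
        score = best = 0
        n = best_len = 0
        while 0 <= pa < len(a) and 0 <= pb < len(b):
            score += match if a[pa] == b[pb] else mismatch
            n += 1
            if score > best:
                best, best_len = score, n
            elif best - score >= x_drop:
                break
            pa += step
            pb += step
        return best, best_len

    # Score the seed itself (seeds may contain mismatches, e.g. spaced seeds).
    seed_score = sum(match if a[i + t] == b[j + t] else mismatch for t in range(k))
    right_score, right_len = walk(i + k, j + k, 1)
    left_score, left_len = walk(i - 1, j - 1, -1)

    return (
        i - left_len,
        j - left_len,
        left_len + k + right_len,
        seed_score + left_score + right_score,
    )


if __name__ == "__main__":
    # Seed "GTA" at position 4 in both sequences.
    print(extend_gapless("TTACGTACGGA", "GGACGTACGCC", 4, 4, 3))  # (2, 2, 7, 7)
```

A seed filter could then keep only seeds whose extended score reaches some threshold. Where that threshold lies, and how the walk is carried over to paths in the graph, is specific to the thesis and not covered by this sketch.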

Dependencies

https://github.com/mabl3/metagraphInterface