Solutions to the Code Challenge questions for the Coursera course.
DESCRIPTION OF THE COURSE: In this course, you will learn how entire genomes are assembled from millions of short overlapping pieces of DNA. The scale of this problem (the human genome is 3 billion nucleotides long!) implies that computers must be involved. Yet the problem is even more complex than it may appear ... to solve it, we will need to travel back in time to meet three famous mathematicians, and learn about algorithms based on graph theory.
Later in the course, we will see that sequencing genomes is not the only task related to decoding biological macromolecules. Another difficult problem is sequencing antibiotics, mini-proteins engineered by bacteria to fight each other. Even though antibiotics often contain fewer than 10 amino acids, sequencing them is a formidable challenge. Decoding the sequence of amino acids making up an antibiotic is an important biomedical problem, but the practical barriers to sequencing short antibiotics are often more substantial than the barriers to assembling a genome with millions of nucleotides! To address this computational challenge, we will learn about brute-force algorithms that often succeed in various bioinformatics applications.
Finally, you will learn how to apply popular bioinformatics software tools to assemble the genome of a deadly Staphylococcus bacterium. You will also be introduced to BaseSpace, the popular cloud service offered by Illumina, the leading DNA sequencing company, thus joining the thousands of biologists and bioinformaticians who use BaseSpace every day.
Course Syllabus
How Do We Assemble Genomes? (Graph Algorithms)
Exploding Newspapers
The String Reconstruction Problem
String reconstruction as a walk in the overlap graph
Another graph for string reconstruction
Walking in the de Bruijn graph
The Seven Bridges of Königsberg
Euler's Theorem
From Euler's Theorem to an Algorithm for Finding Eulerian Cycles
Assembling genomes from read-pairs
Epilogue: Genome assembly faces real sequencing data
How Do We Sequence Antibiotics? (Brute Force Algorithms)
The Discovery of Antibiotics
How Do Bacteria Make Antibiotics?
Dodging the Central Dogma
Sequencing Antibiotics by Shattering them into Pieces
A Brute Force Algorithm for Cyclopeptide Sequencing
A Branch-and-Bound Algorithm for Cyclopeptide Sequencing
Just How Fast Is This Algorithm?
Adapting Cyclopeptide Sequencing for Spectra with Errors
From 20 to More than 100 Amino Acids
The Spectral Convolution Saves the Day
Epilogue: From Simulated to Real Spectra
Bioinformatics Application Challenge: Sequencing a Staphylococcus aureus genome