Project for the Computational Human Genomics course, taught by Prof. Francesca Demichelis (academic year 2021-2022).
The data for this project can be found here: https://drive.google.com/drive/folders/1s0OmsoLuXIixuvDGW5dHN2rLolsvWsv5?usp=sharing
We started from two BAM files (Tumor and matched Control), along with supporting files (reference genome, annotation files, ...), with the aim of performing germline variant calling, somatic small-variant and copy-number variant calling, ancestry analysis, and tumor ploidy and purity estimation.
The project addresses 10 specific tasks:
- Explore statistics about the raw aligned reads contained in the two BAM files
- Perform realignment and recalibration on both BAM files
- Identify and annotate heterozygous SNPs
- Determine the ancestry of the patient
- Identify somatic copy number variants
- Identify somatic point mutations
- Determine how DNA repair genes are affected by germline CNVs and SNPs
- Determine which DNA repair genes overlap both germline heterozygous copy-number deletions and somatic point mutations
- Determine tumor purity and ploidy
- Determine the similarity of Tumor and Control samples
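
As an illustration of the task on DNA repair genes that overlap both heterozygous deletions and somatic point mutations, the core logic reduces to an interval-overlap check. The sketch below is a minimal example under stated assumptions, not the pipeline used in this project: the `Interval` type, the `genes_hit_by_both` helper, the gene names, and all coordinates are hypothetical, and intervals are assumed to be 0-based and half-open (BED convention).

```python
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    """A genomic interval (assumed 0-based, half-open, as in BED files)."""
    chrom: str
    start: int  # inclusive
    end: int    # exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def genes_hit_by_both(
    genes: Dict[str, Interval],
    deletions: List[Interval],
    mutations: List[Interval],
) -> List[str]:
    """Return the sorted names of genes that overlap at least one deletion
    segment AND at least one point mutation."""
    hits = []
    for name, gene in genes.items():
        in_deletion = any(overlaps(gene, d) for d in deletions)
        has_mutation = any(overlaps(gene, m) for m in mutations)
        if in_deletion and has_mutation:
            hits.append(name)
    return sorted(hits)


# Toy data: names and coordinates are made up for illustration only.
genes = {
    "GENE_A": Interval("chr1", 100, 200),
    "GENE_B": Interval("chr1", 500, 600),
    "GENE_C": Interval("chr2", 100, 200),
}
deletions = [Interval("chr1", 150, 550)]  # heterozygous deletion segment
mutations = [
    Interval("chr1", 120, 121),  # point mutation inside GENE_A
    Interval("chr2", 150, 151),  # point mutation inside GENE_C
]

print(genes_hit_by_both(genes, deletions, mutations))
```

In this toy example only `GENE_A` is reported: `GENE_B` lies in the deletion but carries no mutation, and `GENE_C` carries a mutation but lies outside any deletion. On real data the same intersection is usually done on BED/VCF files with a dedicated interval tool, since the nested loops above scale poorly.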