This project provides hands-on experience in DNA analysis using R, covering sequence manipulation, mutation analysis, GC content, fragment analysis, sequence alignment, RNA-seq, variant analysis, and phylogenetics.

lisabensoussan/DNA_lab

README for DNA Analysis Project


Project Overview:

This project is a series of laboratory exercises on DNA analysis in R. Each lab covers a different aspect of data wrangling, visualization, or statistical analysis as applied to DNA sequencing and genomics data.


Labs Overview:

  1. Lab 1: Introduction to DNA Data Analysis

    • Focus: Basic DNA sequence manipulation, extraction, and visualization techniques using R.
    • Key Tools: ggplot2, dplyr, and basic R functions for string manipulation and plotting.
  2. Lab 2: DNA Mutation Analysis

    • Focus: Identifying and analyzing mutations within DNA sequences.
    • Key Tools: Mutation frequency analysis, comparative genomics, and visualization of mutations across different samples.
  3. Lab 3: GC Content Analysis

    • Focus: Calculating GC content in DNA sequences and its implications for genomic stability.
    • Key Tools: Sliding window algorithms for GC content analysis and R plotting libraries for visualizing GC content distribution.
  4. Lab 4: DNA Fragment Analysis

    • Focus: Analyzing DNA fragment lengths and their distribution in genomic samples.
    • Key Tools: Histograms, density plots, and statistical tests to compare fragment lengths across different conditions.
  5. Lab 5: Sequence Alignment

    • Focus: Aligning DNA sequences and evaluating the quality of alignments.
    • Key Tools: Pairwise and multiple sequence alignment techniques, BLAST, and visualization of alignment results.
  6. Lab 6: Phylogenetic Tree Construction

    • Focus: Constructing phylogenetic trees based on DNA sequence similarity.
    • Key Tools: Distance matrices, neighbor-joining methods, and tree visualization libraries.
  7. Lab 7: RNA Sequencing Data Analysis

    • Focus: Analyzing RNA sequencing data to study gene expression levels.
    • Key Tools: RNA-seq data processing, differential expression analysis, and visualizing expression levels with heatmaps and volcano plots.
  8. Lab 8: DNA Variant Analysis

    • Focus: Identifying and analyzing single nucleotide polymorphisms (SNPs) and other variants in DNA sequences.
    • Key Tools: Variant calling tools, annotation of variants, and visualization of variant distribution across populations.
  9. Final Project: Comprehensive DNA Data Analysis

    • Focus: A capstone analysis of a complete DNA dataset that combines the techniques from the previous labs.
    • Key Tools: Sequence alignment, mutation analysis, GC content, fragment analysis, and variant calling, combined to give a holistic view of the genomic data.
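
As an illustration of the sliding-window GC content analysis named in Lab 3, here is a minimal sketch in base R. It is not taken from the lab notebooks: the function names `gc_fraction` and `sliding_gc`, the window and step sizes, and the example sequence are all hypothetical and chosen only for demonstration.

```r
# Fraction of G and C bases in a DNA string (case-insensitive).
gc_fraction <- function(dna) {
  bases <- strsplit(toupper(dna), "")[[1]]
  sum(bases %in% c("G", "C")) / length(bases)
}

# GC fraction in each window of `window` bases, moving `step` bases at a time.
# Returns a data frame with the 1-based start position of each window.
sliding_gc <- function(dna, window = 4, step = 1) {
  n <- nchar(dna)
  stopifnot(window >= 1, step >= 1, window <= n)
  starts <- seq(1, n - window + 1, by = step)
  gc <- vapply(
    starts,
    function(s) gc_fraction(substr(dna, s, s + window - 1)),
    numeric(1)
  )
  data.frame(start = starts, gc = gc)
}

# Made-up sequence, for demonstration only.
example_dna <- "ATGCGCGCATATTA"
overall <- gc_fraction(example_dna)
windows <- sliding_gc(example_dna, window = 4, step = 2)
print(windows)
```

The resulting data frame can be passed directly to a plotting function (for example, `start` on the x-axis and `gc` on the y-axis) to visualize how GC content varies along the sequence.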

Requirements:

  • R Libraries:

    • ggplot2, dplyr, Biostrings, phytools, seqinr, and more.
  • Data:

    • Publicly available DNA sequencing datasets.
  • Software:

    • RStudio, Jupyter Notebook with R kernel, or a similar environment for R-based DNA analysis.

Instructions for Use:

  1. Download the repository and open the .Rmd file for the lab you are working on.
  2. Ensure that all required libraries are installed before running the scripts.
  3. Run the code chunks in the R Markdown file in order to perform the DNA analysis.
  4. Review the results and visualizations generated after each lab section to understand the genomic insights provided by the analysis.
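
Step 2 can be checked with a short snippet before opening a lab. The sketch below assumes only the packages named explicitly under Requirements; the "and more" in that list is not specified, so the `required` vector may need to be extended.

```r
# Packages named in the Requirements section; extend this vector as needed.
required <- c("ggplot2", "dplyr", "Biostrings", "phytools", "seqinr")

# requireNamespace() returns FALSE when a package is not installed.
# It loads the namespace but does not attach the package.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) == 0) {
  message("All required packages are installed.")
} else {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # CRAN packages can be installed with install.packages().
  # Biostrings is distributed through Bioconductor, so it is installed with
  # BiocManager::install("Biostrings") rather than install.packages().
}
```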

Authors:

  • Lisa Mechaly Bensoussan
  • Emmanuelle Fareau
  • Dan Levy
