An R pipeline to process low pass whole genome sequencing and call copy number variation

crickbabs/LowPassKaryo

Introduction

An R pipeline that automatically processes low-pass whole genome sequencing data and detects copy number variation using the QDNAseq package.

Pipeline summary

  1. Raw read QC (FastQC)
  2. Adapter/quality trimming (Trim Galore)
  3. Post trimming QC (FastQC)
  4. Alignment (bwa v0.7.15-r1140)
  5. Sorting and indexing (Samtools)
  6. Copy number calling (QDNAseq)
  7. Summary report generation (R)
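
To make step 6 more concrete, here is a minimal sketch of copy number calling with QDNAseq on a single sorted and indexed BAM file. It follows the general sequence described in the QDNAseq documentation and is not taken from the LowPassKaryo scripts. The function name `call_copy_number` and its arguments `bam_path` and `bin_size_kb` are hypothetical, and the sketch assumes that QDNAseq and a matching bin annotation package (for example QDNAseq.hg19) are installed.

```r
# Sketch of step 6 only: copy number calling on one sorted, indexed BAM file.
# Not the pipeline's own code. `call_copy_number`, `bam_path` and
# `bin_size_kb` are hypothetical names chosen for illustration.
call_copy_number <- function(bam_path, bin_size_kb = 15) {
  if (!requireNamespace("QDNAseq", quietly = TRUE)) {
    stop("QDNAseq is required for this sketch")
  }
  stopifnot(file.exists(bam_path))

  # Fixed-width genomic bins (size in kbp) and per-bin read counts.
  bins   <- QDNAseq::getBinAnnotations(binSize = bin_size_kb)
  counts <- QDNAseq::binReadCounts(bins, bamfiles = bam_path)

  # Filter problematic bins, then correct for GC content and mappability.
  filtered   <- QDNAseq::applyFilters(counts, residual = TRUE, blacklist = TRUE)
  filtered   <- QDNAseq::estimateCorrection(filtered)
  corrected  <- QDNAseq::correctBins(filtered)
  normalized <- QDNAseq::normalizeBins(corrected)
  smoothed   <- QDNAseq::smoothOutlierBins(normalized)

  # Segment the profile and call gains and losses.
  segmented <- QDNAseq::segmentBins(smoothed, transformFun = "sqrt")
  segmented <- QDNAseq::normalizeSegmentedBins(segmented)
  QDNAseq::callBins(segmented)
}
```

With QDNAseq attached via `library(QDNAseq)`, calling `plot()` on the returned object draws the copy number profile for that sample.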

Typical workflow

After low-pass whole genome sequencing of a number of samples, a typical workflow involves the following steps:

  1. Creating a design file associating each set of FastQ files with the appropriate sample, genome and annotation information.
  2. Passing this design file to the main LowPassKaryo_Wrapper.R script, which will sanity-check the parameters and then handle submission of processing jobs to your HPC cluster/farm.
  3. On successful completion, the pipeline will produce one PDF file containing QDNAseq copy number profiles for each species included in the processing run and an HTML report containing primary alignment QC metrics and recording the software versions used.
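
As an illustration of the kind of sanity checking mentioned in step 2, the sketch below validates a design table before any jobs are submitted. It is not the wrapper's actual logic. The function `check_design` and the column names `sample`, `fastq` and `genome` are hypothetical; the real design file layout is described in the DOCS/ directory.

```r
# Hypothetical design-file check. The column names are illustrative
# assumptions, not the format LowPassKaryo expects (see DOCS/ for that).
check_design <- function(design, required = c("sample", "fastq", "genome")) {
  # Without the required columns no further checks are meaningful.
  missing_cols <- setdiff(required, names(design))
  if (length(missing_cols) > 0) {
    return(paste("missing column:", missing_cols))
  }

  problems <- character(0)
  fastq  <- as.character(design$fastq)
  genome <- as.character(design$genome)

  # Each FastQ file should be assigned to exactly one row.
  if (anyDuplicated(fastq) > 0) {
    problems <- c(problems, "a FastQ file is listed more than once")
  }

  # Every listed FastQ file should exist on disk.
  not_found <- fastq[!file.exists(fastq)]
  if (length(not_found) > 0) {
    problems <- c(problems, paste("file not found:", not_found))
  }

  # Every sample needs a genome to align against.
  empty <- is.na(genome) | genome == ""
  if (any(empty)) {
    problems <- c(problems,
                  paste("no genome given for sample:", design$sample[empty]))
  }

  problems
}
```

The function returns a character vector of problems, so an empty result means the table passed every check. Collecting all problems rather than stopping at the first lets a user fix the design file in one pass.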

Details of the local configuration required to set up the pipeline, along with instructions on how to run it, can be found in the DOCS/ directory:

  1. Local Config
  2. Usage
