
Variant Storage Engine

Jacobo Coll Moragón edited this page Jun 2, 2016 · 7 revisions

Overview

  • Study oriented
  • Cohort definition

Different VCF types:

  • Aggregated VCFs: variant files with no sample-specific values, only aggregated data.
  • Merged VCFs: variant files containing a batch of samples, with sample-specific data.
  • gVCFs: single-sample files with information for all positions.
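
As a rough illustration of how these three types differ structurally, the sketch below guesses the type of a VCF from its header and records. It is a heuristic written for this page, not the storage engine's detection logic. The function name `classify_vcf` is hypothetical, and the rule that gVCF reference blocks carry a symbolic ALT allele such as `<NON_REF>` or `<*>` is an assumption about how the gVCF was produced.

```python
def classify_vcf(lines):
    """Guess the VCF type from header and records (illustrative heuristic only)."""
    samples = []
    has_reference_blocks = False
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # The 8 fixed columns are followed by FORMAT; anything after is a sample.
            samples = fields[9:]
            continue
        alt = fields[4]
        # Assumption: reference blocks use a symbolic ALT allele.
        if "<NON_REF>" in alt or "<*>" in alt:
            has_reference_blocks = True
    if not samples:
        return "aggregated"  # no sample columns, only site-level data
    if len(samples) == 1 and has_reference_blocks:
        return "gVCF"        # one sample, with records covering reference positions
    return "merged"          # any other file that carries per-sample data
```

A file with only the eight fixed columns is reported as aggregated, a file with several sample columns as merged, and a single-sample file with reference blocks as a gVCF.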

Index Pipeline

Split into steps:

  1. Transform
  2. Load
  3. Annotate
  4. Calculate Stats
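
The ordering of the steps can be sketched as a small driver that runs them in sequence. This is only a schematic view under stated assumptions: `run_index_pipeline`, `PIPELINE_STEPS`, and the placeholder step functions are hypothetical names invented for this example, and the assumption that each step consumes the previous step's output is a simplification, not a description of the engine's internals.

```python
from typing import Callable, Dict, List

# Step order taken from the list above; names are hypothetical labels.
PIPELINE_STEPS = ["transform", "load", "annotate", "calculate_stats"]


def run_index_pipeline(input_file: str,
                       steps: Dict[str, Callable[[str], str]]) -> List[str]:
    """Run every step in order, feeding each step's result to the next one."""
    log = []
    current = input_file
    for name in PIPELINE_STEPS:
        current = steps[name](current)
        log.append(f"{name}: {current}")
    return log


# Placeholder steps that only tag their input, so the order is visible.
example_steps = {
    "transform": lambda path: path + ".transformed",
    "load": lambda path: "db:" + path,
    "annotate": lambda ref: ref + "+annotation",
    "calculate_stats": lambda ref: ref + "+stats",
}
```

Calling `run_index_pipeline("study.vcf", example_steps)` returns one log entry per step, in pipeline order.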

1) Transform

2) Load

  • Variant merging (plugin dependent).

3) Variant Annotation

Variants are annotated using the CellBase annotator. Other annotators, such as VEP, can also be used.

4) Variant Stats

  • Variant stats (cohorts)
  • Global stats
  • Sample stats (pending)
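
To make the idea of cohort-level variant stats concrete, here is a minimal sketch that computes allele counts and frequencies for one variant, restricted to the samples of a cohort. It is not the engine's implementation. The function name `cohort_allele_stats`, the input layout (a dict from sample name to a VCF-style genotype string), and the output keys are all assumptions made for this example.

```python
from collections import Counter


def cohort_allele_stats(genotypes, cohort):
    """Allele counts and frequencies for one variant within a cohort.

    genotypes: dict mapping sample name to a genotype string like "0/1" or "1|1"
    cohort: iterable of sample names belonging to the cohort
    """
    counts = Counter()
    for sample in cohort:
        gt = genotypes.get(sample, "./.")  # treat absent samples as missing
        for allele in gt.replace("|", "/").split("/"):
            if allele != ".":              # skip missing alleles
                counts[int(allele)] += 1
    total = sum(counts.values())
    freqs = {a: c / total for a, c in counts.items()} if total else {}
    return {
        "allele_count": dict(counts),
        "allele_number": total,
        "allele_freq": freqs,
    }
```

Running the same function over different cohorts of the same study gives per-cohort frequencies for a variant. Missing genotypes are excluded from the allele number.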

Querying variant data

Once the variants are loaded, they can be queried to obtain filtered results. This can be done using any of the available clients (Java, Python, JavaScript, R, ...). Read more about the available filters at Querying Variant Data.
