Variant Storage Engine

Overview

Study oriented
Cohort definition

Different VCF types:

Aggregated VCFs: Variant files with no sample-specific values, only aggregated data.
Merged VCFs: Variant files containing a batch of samples, with sample-specific data.
gVCFs: Single-sample files with information for all positions.
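
As a rough illustration of how the three file types differ, the following sketch guesses the type from the column header line and the ALT values of a file. It is only a heuristic under stated assumptions, not how the storage engine decides: the function name classify_vcf is hypothetical, and the check for reference blocks relies on the `<NON_REF>` and `<*>` symbolic alleles that gVCF-producing callers commonly emit.

```python
# Hypothetical helper, not part of OpenCGA: guess the VCF flavour.
FIXED_COLUMNS = 9  # CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT
REF_BLOCK_ALLELES = {"<NON_REF>", "<*>"}  # assumption: common gVCF markers


def classify_vcf(header_line, alt_values):
    """Return 'aggregated', 'merged' or 'gVCF' for a VCF file.

    header_line: the tab-separated '#CHROM ...' line of the file.
    alt_values:  ALT column values taken from some of its records.
    """
    columns = header_line.rstrip("\n").split("\t")
    n_samples = max(0, len(columns) - FIXED_COLUMNS)
    if n_samples == 0:
        # No sample columns at all: only aggregated data is available.
        return "aggregated"
    has_ref_blocks = any(
        allele in REF_BLOCK_ALLELES
        for alt in alt_values
        for allele in alt.split(",")
    )
    if n_samples == 1 and has_ref_blocks:
        return "gVCF"
    # Any other file with sample columns is treated as a batch of samples.
    return "merged"
```

Under this heuristic, a single-sample file without reference blocks is reported as a merged VCF with a batch of one sample.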

Index Pipeline

Split into steps:

Transform
Load
Annotate
Calculate Stats
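
The ordering of the steps above can be sketched as a simple sequential runner. This is a minimal sketch, not the actual OpenCGA implementation: the step names, the handlers mapping and the stop_after parameter are all hypothetical and only illustrate that each step consumes the result of the previous one.

```python
# Hypothetical step names mirroring the list above.
PIPELINE_STEPS = ["transform", "load", "annotate", "calculate_stats"]


def run_pipeline(handlers, data, stop_after=None):
    """Run the steps in order, feeding each step the previous result.

    handlers:   dict mapping a step name to a one-argument callable.
    stop_after: optional step name after which the run stops early.
    """
    for name in PIPELINE_STEPS:
        data = handlers[name](data)
        if name == stop_after:
            break
    return data
```

For example, passing stop_after="load" would run only the transform and load steps, leaving annotation and stats for later.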

1) Transform

Validation
Variant Normalization
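
To give an idea of what variant normalization involves, here is a minimal sketch that splits multi-allelic records and trims the bases shared by the reference and alternate alleles. It is an assumption-laden illustration, not the engine's own normalizer: the function names are hypothetical, empty alleles are allowed for insertions and deletions, and no left-alignment against a reference genome is attempted.

```python
def normalize(pos, ref, alt):
    """Trim bases shared by REF and ALT and return (pos, ref, alt)."""
    # Drop shared trailing bases first; the position is unaffected.
    while ref and alt and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then drop shared leading bases, moving the start position forward.
    while ref and alt and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def normalize_record(pos, ref, alts):
    """Split a comma-separated ALT field and normalize each allele."""
    return [normalize(pos, ref, alt) for alt in alts.split(",")]
```

With this sketch, a record at position 100 with REF "CA" and ALT "CAT,C" would yield an insertion of "T" at position 102 and a deletion of "A" at position 101.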

2) Load

Variant Merging (plugin dependent)

3) Variant Annotation

Annotate variants using the CellBase annotator. Other annotators, such as VEP, can also be used.

4) Variant Stats

Variant stats (cohorts)
Global stats
Sample stats (pending)
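
As an illustration of cohort-based variant stats, the sketch below counts alleles for one variant over the samples of a cohort. It is a minimal example under stated assumptions, not the stats calculator of the storage engine: the function name, the input layout (a dict of sample name to genotype string) and the returned keys are hypothetical, and every non-reference allele is counted as alternate.

```python
def cohort_variant_stats(genotypes, cohort):
    """Count alleles of one variant across the samples of a cohort.

    genotypes: dict mapping sample name to a genotype such as '0/1'.
    cohort:    iterable with the sample names that form the cohort.
    """
    ref_count = alt_count = missing = 0
    for sample in cohort:
        # Treat phased ('|') and unphased ('/') genotypes alike.
        for allele in genotypes[sample].replace("|", "/").split("/"):
            if allele == ".":
                missing += 1
            elif allele == "0":
                ref_count += 1
            else:
                alt_count += 1
    called = ref_count + alt_count
    return {
        "ref_count": ref_count,
        "alt_count": alt_count,
        "missing_alleles": missing,
        "alt_freq": alt_count / called if called else 0.0,
    }
```

Because the cohort is just a list of sample names, the same genotypes can produce different stats for different cohorts of the same study.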

Querying variant data

Once the variants have been loaded, they can be queried to obtain filtered results. This can be done using any of the available clients (Java, Python, JavaScript, R, ...). Read more about the available filters at Querying Variant Data.

OpenCGA is an open source project and it is freely available.
