
Index Pipelines

Jacobo Coll Moragón edited this page Sep 26, 2016 · 12 revisions

Overview

The index pipeline is the process of ingesting data into an OpenCGA-Storage backend. We define a general pipeline that is used and extended by each of the supported formats. This pipeline can be extended with additional enrichment steps, which are highly dependent on the file format. At the end, the data may be filtered .. [ ... ]

This concept is represented in Catalog to help track the indexing status of each file.
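
As a rough illustration of the idea described above, the following Python sketch models a general pipeline (transform, load, then format-specific enrichment) with a variant pipeline extending it. It is not OpenCGA code: the class names, method names, and status values (`IndexPipeline`, `VariantIndexPipeline`, `IndexStatus`) are all hypothetical and chosen only to mirror the steps listed on this page.

```python
from abc import ABC, abstractmethod
from enum import Enum


class IndexStatus(Enum):
    # Hypothetical status values; the statuses actually used by Catalog may differ.
    NONE = "NONE"
    TRANSFORMED = "TRANSFORMED"
    LOADED = "LOADED"
    ENRICHED = "ENRICHED"


class IndexPipeline(ABC):
    """General pipeline: transform, load, then optional enrichment steps."""

    def __init__(self):
        self.status = IndexStatus.NONE

    @abstractmethod
    def transform(self, input_path):
        """Convert the input file into an intermediate, loadable form."""

    @abstractmethod
    def load(self, transformed):
        """Write the transformed data into the storage backend."""

    def enrichment_steps(self):
        # Format-specific pipelines override this to add their own steps.
        return []

    def index(self, input_path):
        transformed = self.transform(input_path)
        self.status = IndexStatus.TRANSFORMED
        self.load(transformed)
        self.status = IndexStatus.LOADED
        steps = self.enrichment_steps()
        for step in steps:
            step()
        if steps:
            self.status = IndexStatus.ENRICHED
        return self.status


class VariantIndexPipeline(IndexPipeline):
    """Toy variant pipeline; each step only records that it ran."""

    def __init__(self):
        super().__init__()
        self.log = []

    def transform(self, input_path):
        self.log.append("transform")
        return input_path + ".transformed"

    def load(self, transformed):
        self.log.append("load")

    def calculate_stats(self):
        self.log.append("stats")

    def annotate(self):
        self.log.append("annotation")

    def enrichment_steps(self):
        return [self.calculate_stats, self.annotate]
```

Calling `VariantIndexPipeline().index("sample.vcf")` runs transform and load first, then the two enrichment steps, updating the status along the way. Keeping the enrichment steps in an overridable method is one way to let each format add its own steps without changing the general flow.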

Index

Indexing data

  • Transform
  • Load
Enrichment
Query / Export

Variant index pipeline

Index
Enrichment
  • Stats calculation
  • Annotation
Query
Metadata

Alignment index pipeline
