description
Making sense of how to organize import files (genomic and clinical) for cBioPortal project imports

File Formats

The following table lists, for each type of genomic data, its required format, any alternate formats, notes, and the mandatory meta file associated with the data file. The required format is also cBioPortal's suggested filename convention, and each link points to the relevant section of the official cBioPortal documentation. Every data file in the import directory must have a corresponding meta file.
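The one-meta-file-per-data-file rule can be checked before an import. The snippet below is a minimal sketch, not part of cBioPortal's own tooling: `missing_meta_files` is a hypothetical helper, and it assumes the `data_<name>.<ext>` to `meta_<name>.txt` naming pattern used in the summary table on this page.

```python
from pathlib import Path


def missing_meta_files(import_dir):
    """Return the names of data_* files that have no matching meta_*.txt file.

    Assumption: meta files follow the naming pattern used on this page,
    e.g. data_segments.seg -> meta_segments.txt.
    """
    root = Path(import_dir)
    missing = []
    for data_file in sorted(root.glob("data_*")):
        if not data_file.is_file():
            continue
        # Strip the "data_" prefix and the extension, then build the meta name.
        meta_name = "meta_" + data_file.stem[len("data_"):] + ".txt"
        if not (root / meta_name).is_file():
            missing.append(data_file.name)
    return missing
```

Calling `missing_meta_files("path/to/import_dir")` returns an empty list when every data file has its meta file. It checks file names only, not the contents of the meta files.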

To set up a "minimal" cBioPortal project, you will need the following files:

NOTE: The more data you provide, the more of cBioPortal's features your project can use. A project created with only the bare minimum of data will not be very useful to its users.

{% hint style="danger" %}

Please use consistent sample IDs across all files (genomic and clinical files) – watch out for underscores and dashes!

{% endhint %}
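Underscore/dash mix-ups are easy to catch once the sample IDs from two files are in hand. The snippet below is a minimal sketch: `near_miss_ids` is a hypothetical helper, and it assumes the IDs have already been read out of the clinical and genomic files (parsing those files is not shown).

```python
def near_miss_ids(clinical_ids, genomic_ids):
    """Pair each genomic ID with a clinical ID that differs only by '-' vs '_'.

    Returns a sorted list of (genomic_id, clinical_id) tuples. IDs that match
    exactly, or that have no near match at all, are not reported.
    """
    def normalize(sample_id):
        return sample_id.replace("-", "_")

    clinical = set(clinical_ids)
    by_normalized = {normalize(s): s for s in clinical}
    mismatches = []
    for sample_id in sorted(set(genomic_ids) - clinical):
        candidate = by_normalized.get(normalize(sample_id))
        if candidate is not None:
            mismatches.append((sample_id, candidate))
    return mismatches


# Made-up IDs for illustration only.
print(near_miss_ids(["P01-T1", "P02_T1"], ["P01_T1", "P02_T1", "P03-T1"]))
# prints [('P01_T1', 'P01-T1')]
```

An empty result only means no dash/underscore near-misses were found. IDs that are missing entirely from the samples file still need a separate check.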

Data Types Summary Table

| Platform (data type) | Required format | Alternate formats | Notes | Associated meta file |
| --- | --- | --- | --- | --- |
| Clinical (Sample centric) | data_clinical_samples.txt | | The samples file is mandatory; this file is where the mapping of the sample IDs happens (all sample IDs across all files, genomic and clinical) | meta_clinical_samples.txt |
| Clinical (Patient centric) | data_clinical_patients.txt | | | meta_clinical_patients.txt |
| WGS (SNV/Indels) | data_mutations_extended.txt | | Must be in MAF format; run vcf2maf | meta_mutations_extended.txt |
| WGS (Structural Variants) | data_sv.txt | mavis_summary_*.tab file | cBioPortal is still working on this; if you have a fusions file instead, it's fine | meta_sv.txt |
| RNA-Seq (Expression) | data_expressions.txt | | | meta_expressions.txt |
| RNA-Seq (Fusion) | data_fusions.txt | mavis_summary_*.tab file | | meta_fusions.txt |
| Segmented (Seg) | data_segments.seg | | | meta_segments.txt |
| Discrete Copy Number (CNA) | data_CNA.txt | | | meta_CNA.txt |
| Methylation | data_methylation.txt | | | meta_methylation.txt |
| Protein (RPPA/Mass Spectrometry) | protein data: data_protein_quantification.txt | | Depends on whether you have RPPA or Mass Spectrometry data; Log2 value or z-score datatypes | meta_protein_quantification.txt |
| | data_protein_quantification_Zscores.txt | | | meta_protein_quantification_Zscores.txt |
| | data_rppa.txt | | | meta_rppa.txt |
| | data_rppa_Zscores.txt | | | meta_rppa_Zscores.txt |
| | data_phosphoprotein_quantification.txt | | | meta_phosphoprotein_quantification.txt |
| Other formats | | | | |

{% hint style="warning" %} A samples file (data_clinical_samples.txt) is mandatory for a project import! It is the key file for mapping the project's sample IDs. {% endhint %}

Skeleton of the directory structure

An example of what a directory of import files looks like.

{% hint style="info" %} For more information on the case_list directory and on case list files, please refer to the Case List Files section. {% endhint %}