description |
---|
Making sense of how to organize import files (genomic and clinical) for cBioPortal project imports |
The following table is organized by the type of genomic data, its required format (which also serves as cBioPortal's suggested filename convention; each link points to the relevant section of the official cBioPortal documentation), any alternate formats, notes, and the mandatory meta file associated with the data file. Every data file in the import directory must have a corresponding meta file.
To set up a "minimal" cBioPortal project, you will need the following files:
NOTE: The more data you provide, the more of cBioPortal's features your project can use. A project created with only the bare minimum of data will not be very useful to its users.
{% hint style="danger" %}
Please use consistent sample IDs across all files (genomic and clinical files) – watch out for underscores and dashes!
{% endhint %}
Platform (data type) | Required format | Alternate formats | Notes | Associated meta file |
---|---|---|---|---|
Clinical (Sample centric) | data_clinical_samples.txt | | The samples file is mandatory; this is where all sample IDs used across the genomic and clinical files are mapped | meta_clinical_samples.txt |
Clinical (Patient centric) | data_clinical_patients.txt | | | meta_clinical_patients.txt |
WGS (SNV/Indels) | data_mutations_extended.txt | | Must be in MAF format; run vcf2maf to convert VCF files | meta_mutations_extended.txt |
WGS (Structural Variants) | data_sv.txt | mavis_summary_*.tab file | cBioPortal is still working on this; if you have a fusions file instead, that is fine | meta_sv.txt |
RNA-Seq (Expression) | data_expressions.txt | | | meta_expressions.txt |
RNA-Seq (Fusion) | data_fusions.txt | mavis_summary_*.tab file | | meta_fusions.txt |
Segmented (Seg) | data_segments.seg | | | meta_segments.txt |
Discrete Copy Number (CNA) | data_CNA.txt | | | meta_CNA.txt |
Methylation | data_methylation.txt | | | meta_methylation.txt |
Protein (RPPA/Mass Spectrometry) | protein data | data_protein_quantification.txt | Depends on whether you have RPPA or mass spectrometry data; log2-value or z-score datatypes | meta_protein_quantification.txt |
Protein (RPPA/Mass Spectrometry) | protein data | data_protein_quantification_Zscores.txt | | meta_protein_quantification_Zscores.txt |
Protein (RPPA/Mass Spectrometry) | protein data | data_rppa.txt | | meta_rppa.txt |
Protein (RPPA/Mass Spectrometry) | protein data | data_rppa_Zscores.txt | | meta_rppa_Zscores.txt |
Protein (RPPA/Mass Spectrometry) | protein data | data_phosphoprotein_quantification.txt | | meta_phosphoprotein_quantification.txt |
Other formats | | | | |
{% hint style="warning" %} A samples file (data_clinical_samples.txt) is mandatory for a project import! It is the key file for mapping the project's sample IDs. {% endhint %}
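
Because every file must use the same sample IDs as the samples file, it can help to compare ID sets before importing. The following is a minimal sketch, not part of cBioPortal itself: the function names `normalize` and `check_sample_ids` are hypothetical, and the sketch assumes you have already extracted the sample IDs from each file into plain lists. It separates IDs that are truly missing from IDs that differ only by underscores versus dashes.

```python
import re


def normalize(sample_id: str) -> str:
    """Strip dashes and underscores so IDs that differ only in separators compare equal."""
    return re.sub(r"[-_]", "", sample_id)


def check_sample_ids(clinical_ids, other_ids):
    """Compare IDs from a genomic file against IDs from the samples file.

    Returns (missing, near_matches):
      missing      - IDs with no counterpart in the samples file
      near_matches - dict mapping an ID to the samples-file ID that
                     differs from it only by dashes/underscores
    """
    clinical = set(clinical_ids)
    # Index the samples-file IDs by their separator-free form.
    by_normalized = {}
    for cid in sorted(clinical):
        by_normalized.setdefault(normalize(cid), cid)

    missing = []
    near_matches = {}
    for sid in sorted(set(other_ids) - clinical):
        candidate = by_normalized.get(normalize(sid))
        if candidate is not None:
            near_matches[sid] = candidate
        else:
            missing.append(sid)
    return missing, near_matches


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    samples_file_ids = ["P01-T1", "P02-T1"]
    mutation_file_ids = ["P01_T1", "P02-T1", "P03-T1"]
    missing, near = check_sample_ids(samples_file_ids, mutation_file_ids)
    print("Missing from samples file:", missing)
    print("Differ only by dash/underscore:", near)
```

A near match usually points to a typo to fix in one of the files, whereas a missing ID means the sample has not been declared in the samples file at all.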
An example of what a directory of import files looks like:
{% hint style="info" %} For more information on what the case_list directory is and what case files are, please refer to the Case List Files section {% endhint %}
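
The rule that every data file needs a meta file can be checked before an import with a small script. This is a rough sketch that relies only on the filename convention used in the table above (`data_<name>.<ext>` paired with `meta_<name>.txt`); the helper names `expected_meta_name` and `missing_meta_files` are hypothetical, and a directory that names its files differently would need a different check.

```python
from pathlib import PurePosixPath


def expected_meta_name(data_name: str) -> str:
    """Map a data filename to its meta filename, e.g. data_segments.seg -> meta_segments.txt."""
    stem = PurePosixPath(data_name).stem  # drops the extension: "data_segments"
    return "meta_" + stem[len("data_"):] + ".txt"


def missing_meta_files(filenames):
    """Return {data file: expected meta file} for every data file lacking its meta file."""
    names = set(filenames)
    missing = {}
    for name in sorted(names):
        if name.startswith("data_"):
            meta = expected_meta_name(name)
            if meta not in names:
                missing[name] = meta
    return missing


if __name__ == "__main__":
    # Hypothetical directory listing built from filenames in the table above.
    listing = [
        "data_clinical_samples.txt",
        "meta_clinical_samples.txt",
        "data_mutations_extended.txt",
        "meta_mutations_extended.txt",
        "data_segments.seg",
    ]
    for data_file, meta_file in missing_meta_files(listing).items():
        print(f"{data_file} has no {meta_file}")
```

To run this against a real import directory, pass it the result of `os.listdir(...)` for that directory instead of the hard-coded listing.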