Unifying Subject-Sample mapping files for HDD data #56

nboukharov · 2017-01-09T16:10:58Z

Subject-Sample mapping files for Expression, Metabolomics, MIRNA_QPCR, MIRNA_SEQ, Protein, RBM and RNASeq data differ slightly from one another, which creates unnecessary issues for curators. We would like to have one format for all HDD data mapping files, the same as is used for Expression data: STUDY_ID, SITE_ID, SUBJECT_ID, SAMPLE_ID, PLATFORM, TISSUETYPE, ATTR1, ATTR2, CATEGORY_CD, SOURCE_CD
The only mandatory fields should be STUDY_ID, SUBJECT_ID, SAMPLE_ID, PLATFORM and CATEGORY_CD. Other columns should be allowed to be null. If a specific loading procedure requires one of the optional columns to have a value, a default value should be inserted (e.g. "Unknown" for TISSUETYPE, "STD" for CATEGORY_CD). The unified mapping file loading procedure should be backward compatible and flexible: both ATTR1 and ATTRIBUTE_1, and both STUDY_ID and TRIAL_NAME, should be accepted as names for the respective columns. All "tokens" (SITE_ID, PLATFORM, TISSUETYPE, ATTR1, ATTR2) should be allowed in CATEGORY_CD in any order (ATTR1 does not need a value for ATTR2 to be used).
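
To make the request concrete, here is a minimal sketch of what a unified reader could look like. It is only an illustration under stated assumptions, not the existing loader: the function name `read_mapping`, the tab delimiter, the `ATTRIBUTE_2` alias and the way defaults are passed in are all hypothetical. Only the column list, the mandatory fields, the ATTR1/ATTRIBUTE_1 and STUDY_ID/TRIAL_NAME aliases and the "Unknown" default for TISSUETYPE come from the description above.

```python
import csv
import io

# Unified column set, as listed in the issue (same as the Expression format).
CANONICAL_COLUMNS = [
    "STUDY_ID", "SITE_ID", "SUBJECT_ID", "SAMPLE_ID", "PLATFORM",
    "TISSUETYPE", "ATTR1", "ATTR2", "CATEGORY_CD", "SOURCE_CD",
]
MANDATORY = ["STUDY_ID", "SUBJECT_ID", "SAMPLE_ID", "PLATFORM", "CATEGORY_CD"]

# Legacy header names mapped to the unified ones.
# ATTRIBUTE_2 is an assumption by analogy with ATTRIBUTE_1.
ALIASES = {
    "TRIAL_NAME": "STUDY_ID",
    "ATTRIBUTE_1": "ATTR1",
    "ATTRIBUTE_2": "ATTR2",
}


def read_mapping(text, defaults=None, delimiter="\t"):
    """Parse mapping-file text into a list of dicts keyed by unified column names.

    `defaults` supplies values for optional columns that a given loading
    procedure needs to be non-empty, e.g. {"TISSUETYPE": "Unknown"}.
    The tab delimiter is an assumption.
    """
    defaults = defaults or {}
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Normalize header names: trim, upper-case, then resolve legacy aliases.
    header = [
        ALIASES.get(name.strip().upper(), name.strip().upper())
        for name in next(reader)
    ]
    rows = []
    for line_no, values in enumerate(reader, start=2):
        if not any(v.strip() for v in values):
            continue  # skip blank lines
        raw = dict(zip(header, (v.strip() for v in values)))
        # Absent or empty optional columns fall back to the default, else "".
        row = {
            col: raw.get(col, "") or defaults.get(col, "")
            for col in CANONICAL_COLUMNS
        }
        missing = [col for col in MANDATORY if not row[col]]
        if missing:
            raise ValueError(
                f"line {line_no}: missing mandatory value(s): {', '.join(missing)}"
            )
        rows.append(row)
    return rows
```

With a reader along these lines, a legacy file whose header uses TRIAL_NAME and ATTRIBUTE_1 and omits TISSUETYPE entirely would load the same way as a file in the unified format, with TISSUETYPE filled in from the supplied defaults.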

mirasrael assigned mirasrael and baroleg and unassigned mirasrael Jan 10, 2017
