MIxS: Minimum Information about any (X) Sequence

This repository contains the source material for the Genomic Standards Consortium (GSC) Minimum Information about any (X) Sequence (MIxS) standard.

MIxS, or the Minimum Information about any (X) Sequence, is a standard for describing the contextual information about the sampling and sequencing of any genomic sequence. The standard defines Terms that describe characteristics of a sample and address questions such as:

  • What is the source of the sequence?
  • In what kind of environment was the sample collected?
  • What methods were utilized to process the sample?

Following the release of MIxS v6.0, subsequent releases (e.g. MIxS 6.1) are represented in and maintained using the LinkML framework. LinkML uses YAML to define schemas. The user-focused and developer-focused sections of the repository structure describe where to find the YAML files that define the standard.

The MIxS standards are found at: https://genomicsstandardsconsortium.github.io/mixs/

Terms

The individual metadata terms are provided in the table: here. These Terms are attributes or properties that describe samples and their sequence-associated metadata. Broadly, MIxS metadata Terms are represented in genomic Checklists, environmental Extensions, and Combinations (of Checklists and Extensions).

Checklists

Checklists include the required, recommended and optional metadata fields (Terms) for a specific type of genomic sequence (e.g. genome, metagenome, microbiome, marker gene, MAG or single cell genome). The MIGS genomic sequences checklist, for example, supports checklists specific to taxa or subcellular structures (Eukaryotes, Bacteria, Viruses, Organelle, Plants).
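The distinction between required, recommended and optional Terms can be sketched as a simple lookup. This is a minimal illustration only: the term names, the requirement levels assigned to them, and the `missing_terms` helper are hypothetical and are not taken from any MIxS Checklist or from LinkML.

```python
# Hypothetical requirement levels for a handful of placeholder term names.
# Real Checklists define their own Terms and levels in the LinkML schema.
REQUIREMENT = {
    "samp_name": "required",
    "collection_date": "required",
    "depth": "recommended",
    "misc_param": "optional",
}


def missing_terms(record, requirement, level="required"):
    """Return the terms at the given level that the record does not provide."""
    return sorted(
        term
        for term, term_level in requirement.items()
        if term_level == level and term not in record
    )


record = {"samp_name": "S1", "misc_param": "x"}
print(missing_terms(record, REQUIREMENT))                 # required terms not filled in
print(missing_terms(record, REQUIREMENT, "recommended"))  # recommended terms not filled in
```

A record would typically be rejected only when required Terms are missing; missing recommended or optional Terms would at most produce a warning.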

Extensions

Extensions include Terms that describe specific environments from which a sample was collected. For example, the Agriculture Extension (MIxS-Ag) includes terms to describe agricultural environments.

Combinations

MIxS Checklists and Extensions are designed to be modular, supporting mix-and-match combinations of any genomic Checklist with Terms from any environmental Extension to create MIxS Combinations. For example, the Combination of the MIMS Checklist and the Agriculture Extension is called MIMSAgriculture.
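The modular idea can be sketched as a union of two sets of Terms. This is an illustrative sketch, not how the LinkML schema builds Combinations: the `TermSet` class, the `combine` function, and the term names are hypothetical placeholders. Only the naming pattern (MIMS plus Agriculture gives MIMSAgriculture) comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermSet:
    """A named collection of metadata term names (hypothetical helper)."""
    name: str
    terms: frozenset


def combine(checklist: TermSet, extension: TermSet) -> TermSet:
    """Build a Combination: concatenated name, union of both term sets."""
    return TermSet(
        name=checklist.name + extension.name,
        terms=checklist.terms | extension.terms,
    )


# Placeholder term names, chosen only for illustration.
mims = TermSet("MIMS", frozenset({"samp_name", "collection_date", "seq_meth"}))
agriculture = TermSet("Agriculture", frozenset({"samp_name", "crop_rotation"}))

combo = combine(mims, agriculture)
print(combo.name)           # MIMSAgriculture
print(sorted(combo.terms))  # shared terms appear once
```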

Repository Structure

Below are descriptions for the various user-facing directories in this repository.

  • examples/ - example data files in different formats (JSON, YAML) containing data conformant with the MIxS standard
  • mixs-templates/ - MIxS schema metadata collection templates in the Excel spreadsheet (.xlsx) format. These templates can be used to organize a project's metadata in preparation for submission to a sequence data archive.
  • project/ - artifacts autogenerated by the suite of generators in the linkml library. Artifacts include JSON-LD, OWL, JSON Schema, spreadsheet, etc. representations of the schema
  • src/
    • src/data/examples/ - valid and invalid data examples
      • The folder structure (valid and invalid folders) and the corresponding YAML data examples in this directory must follow the guidelines of the linkml-run-examples testing framework
    • src/mixs/
    • src/docs/ - Markdown files that can be converted to HTML and included in the web documentation pages

Developer Documentation

Note: Developer documentation is specifically included here for the use of members of the GSC's CIG and TWG committees.

Use the `make` command to generate project artifacts:
  • make all: make everything
  • make deploy: deploys site

Documentation about the contents of the developer-focused directories in this repository:

LinkML:

MIxS uses LinkML; see linkml-project-cookiecutter.