Full-scan MS data from both LC-MS and MS imaging capture multiple ion forms, including their in/post-source fragments. Here we leverage such fragments to structurally annotate full-scan data from LC-MS or MS imaging by matching against MS/MS spectral libraries.
This workflow requires Python 3.9+. It has been tested on macOS (14.6, M2 Max) and Linux (Ubuntu 20.04).
- Clone the GitHub repository.
    git clone git@github.com:Philipbear/ms1_id.git
- Install the dependencies. Typical installation time is <2 min.
    pip install -r requirements.txt
- Run ms1id_lcms.py for LC-MS data, and ms1id_msi.py for MS imaging data.
  - An example command for LC-MS data (mzML or mzXML files in the lc_ms/data folder):
    python ms1id_lcms.py --project_dir lc_ms --sample_dir data --ms1_id --ms1_id_libs data/gnps.pkl data/gnps_k10.pkl
  - An example command for MS imaging data (imzML and ibd files in the msi folder):
    python ms1id_msi.py --project_dir msi --libs data/gnps.pkl data/gnps_k10.pkl
  - For more options, run python ms1id_lcms.py --help or python ms1id_msi.py --help.
- Output files will be in the project directory. MS1 annotations can be accessed from:
  - LC-MS data: aligned_feature_table.tsv
  - MS imaging data: ms1_id_annotations_derep.tsv
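Once a run has finished, the annotation tables can be inspected with a few lines of Python. The following is a minimal sketch using only the standard library; the helper name `load_annotation_table` is hypothetical, and the exact location of the TSV inside the project directory is an assumption based on the LC-MS example above. No column names are assumed, so the script simply reports whatever header the file contains.

```python
import csv
from pathlib import Path


def load_annotation_table(tsv_path):
    """Read a tab-separated table and return (column names, list of row dicts).

    Hypothetical helper for illustration; it is not part of ms1_id.
    """
    with Path(tsv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        columns = list(reader.fieldnames or [])
    return columns, rows


if __name__ == "__main__":
    # Assumption: the table sits directly in the project directory "lc_ms".
    table = Path("lc_ms") / "aligned_feature_table.tsv"
    if table.exists():
        columns, rows = load_annotation_table(table)
        print(f"{len(rows)} rows; columns: {columns}")
    else:
        print(f"{table} not found; run the workflow first.")
```

The same function should work for ms1_id_annotations_derep.tsv from an MS imaging run, since both outputs are tab-separated files.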
Expected runtime is <1 min for a single LC-MS file and <5 min for a single MS imaging dataset.
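To process several LC-MS project directories in sequence, the example command can be assembled programmatically. This is a sketch under stated assumptions: the function names `build_lcms_command` and `run_projects` are hypothetical, the script is assumed to be invoked from the repository root, and only the flags shown in the example command above are used.

```python
import shlex
import subprocess
import sys


def build_lcms_command(project_dir, sample_dir, libs):
    """Assemble the LC-MS example command as an argument list.

    Hypothetical helper; uses only flags that appear in the example command.
    """
    return [
        sys.executable, "ms1id_lcms.py",
        "--project_dir", str(project_dir),
        "--sample_dir", str(sample_dir),
        "--ms1_id",
        "--ms1_id_libs", *[str(lib) for lib in libs],
    ]


def run_projects(project_dirs, sample_dir="data",
                 libs=("data/gnps.pkl", "data/gnps_k10.pkl")):
    """Run the workflow once per project directory, stopping on the first failure."""
    for project_dir in project_dirs:
        cmd = build_lcms_command(project_dir, sample_dir, libs)
        print("Running:", shlex.join(cmd))
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the command here; call run_projects([...]) to execute it.
    example = build_lcms_command("lc_ms", "data",
                                 ["data/gnps.pkl", "data/gnps_k10.pkl"])
    print(shlex.join(example))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when directory names contain spaces.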
Note: Indexed libraries are needed for the workflow. You can download the indexed GNPS library here.
To build your own indexed library, run index_library.py.
Shipei Xing, Vincent Charron-Lamoureux, Yasin El Abiead, Pieter C. Dorrestein. Annotating full-scan MS data using tandem MS libraries. bioRxiv 2024.
- GNPS MS/MS library
  - ALL_GNPS_NO_PROPOGATED.msp, downloaded on July 17, 2024
  - Indexed version available here
- LC-MS data
  - Pooled chemical standards (GNPS/MassIVE MSV000095789)
  - NIST human feces (GNPS/MassIVE MSV000095787)
  - IBD dataset (original paper, data)
- MS imaging data
  - Mouse brain (original paper, data)
  - Mouse body (METASPACE dataset)
  - Hepatocytes (METASPACE dataset)
This project is licensed under the Apache 2.0 License (Copyright 2024 Shipei Xing).