The S is silent.
MaCcoyS creates a spectrum-specific search space containing all available targets matching the precursor(s) of a given MS2 spectrum by fetching them from MaCPepDB. The MS2 spectrum in question is then identified using Comet and this specialized search space.
The PSM validation does not rely on the usual FDR but on a newly introduced hyper score. For this, an exponential distribution is fitted to the PSM score distribution of each spectrum. If the distribution fits well, which is tested using the [???] test, the cumulative distribution function is used to calculate a survival score for each spectrum, which is the new hyper score.
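The following Python sketch illustrates the scoring idea. It is not MaCcoyS' actual implementation: the use of scipy, the Kolmogorov-Smirnov test as a stand-in for the unnamed goodness-of-fit test, and the function name are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

def survival_scores(psm_scores: np.ndarray) -> np.ndarray:
    """Sketch: fit an exponential distribution to the PSM scores of one
    spectrum and score each PSM by its survival probability."""
    # Fit the exponential distribution to the per-spectrum PSM scores
    # (location fixed at 0, so only the scale is estimated).
    loc, scale = stats.expon.fit(psm_scores, floc=0)

    # Goodness-of-fit check; Kolmogorov-Smirnov is used here only as a
    # placeholder for the test MaCcoyS actually applies.
    _, p_value = stats.kstest(psm_scores, "expon", args=(loc, scale))
    if p_value < 0.05:
        raise ValueError("exponential distribution does not fit the PSM scores")

    # Survival function (1 - CDF): the probability of observing a score
    # at least this high under the fitted distribution.
    return stats.expon.sf(psm_scores, loc=loc, scale=scale)
```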
TODO: Separation of PSMs
Coming as soon as published on crates.io and pypi.org. For now, see section Development.
Coming as soon as published on quay.io. For now, see section Build latest.
- Clone the repository
- Run
  ```shell
  docker build -t local/maccoys:latest -f docker/Dockerfile .
  ```
MaCcoyS expects the MS data in mzML format. Here is a list of tools to convert different vendor formats:
- Thermo .raw-file:
  - msconvert of ProteoWizard - Enable vendor peak picking. Windows users have a GUI. Also available for Linux and Mac (Intel & Apple Silicon) users via Docker:
    ```shell
    docker run -it -e WINEDEBUG=-all -v $(pwd):/data chambm/pwiz-skyline-i-agree-to-the-vendor-licenses wine msconvert ...
    ```
  - ThermoRawFileParserGUI & ThermoRawFileParser - Standard settings are fine. Does not work on Apple Silicon.
- Bruker .d-folders:
  - [tdf2mzml](https://github.com/mafreitas/tdf2mzml) - CLI only. Use `--ms1_type centroid`.
MaCcoyS has an integrated pipeline for batch processing which can be used locally or on a distributed system.
Great for testing and playing around.
- Generate a Comet config
  - native:
    ```shell
    comet -p
    ```
  - docker:
    ```shell
    docker run --rm -it --entrypoint "" local/maccoys:dev bash -c 'comet -p > /dev/null; cat comet.params.new' > comet.params.new
    ```
- Generate a pipeline config
  - native:
    ```shell
    maccoys pipeline new-config
    ```
  - docker:
    ```shell
    docker run --rm -it local/maccoys:dev pipeline new-config
    ```
- Adjust both configs to your MS parameters and experimental design
- Run the pipeline
  - native:
    ```shell
    maccoys -vvvvvv -l <PATH_TO_LOG_FILE> pipeline local-run <RESULT_FOLDER_PATH> <PATH_TO_MACCOYS_CONFIG_TOML> <PATH_TO_COMET_CONFIG> <MZML_FILES_0> <MZML_FILE_1> ...
    ```
  - docker:
    ```shell
    docker run --rm -it -v <ABSOLUTE_PATH_ON_HOST>:/data local/maccoys:dev -vvvvvv -l /data/logs/maccoys.log pipeline local-run /data/results /data/<PATH_TO_MACCOYS_CONFIG_TOML> /data/<PATH_TO_COMET_CONFIG> /data/experiment/<MZML_FILES_0> /data/experiment/<MZML_FILE_1> ...
    ```
Check out `... pipeline --help` and have a look at the optional parameters and their descriptions, which might be helpful.
The pipeline can also be deployed on multiple machines. Have a look into the `Procfile` to get an idea of the setup. The only requirement for the deployment is that each part of the pipeline has access to the results folder, e.g. via NFS, and to a central Redis server.
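To illustrate the shape of such a setup, a Procfile simply maps one process name to one long-running command per pipeline part. The process names and command stubs below are placeholders, not the real entries; consult the repository's `Procfile` for those:

```
# Placeholder sketch only; see the repository's Procfile for the real
# process names and maccoys subcommands.
entrypoint: maccoys ...
worker: maccoys ...
```

Each process can then be started on any machine that can reach the shared results folder and the Redis server.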
You can test it via:
- Shell 1:
  ```shell
  docker compose up
  ```
- Shell 2:
  ```shell
  ultraman start
  ```
  (you can use any Procfile manager, like [ultraman](https://github.com/yukihirop/ultraman), [foreman](https://github.com/ddollar/foreman) or [honcho](https://github.com/nickstenning/honcho/tree/main))
Using the following command, the search is sent to the remote entrypoint and scheduled:
- native:
  ```shell
  maccoys -vvvvvv pipeline remote-run <API_BASE_URL> <PATH_TO_SEARCH_PARAMETER_TOML> <PATH_TO_COMET_CONFIG> <MZML_FILES_0> <MZML_FILE_1> ...
  ```
- docker:
  ```shell
  docker run --rm -it -v <ABSOLUTE_PATH_ON_HOST>:/data local/maccoys:dev -vvvvvv pipeline remote-run <API_BASE_URL> /data/<PATH_TO_SEARCH_PARAMETER_TOML> /data/<PATH_TO_COMET_CONFIG> /data/experiment/<MZML_FILES_0> /data/experiment/<MZML_FILE_1> ...
  ```
MaCcoyS will print a UUID identifying the search, e.g. to check the progress with `... -vvvvvvv pipeline search-monitor <API_BASE_URL> <UUID>` and to find the results.
- Rust: The recommended way is to use rustup
- Conda | Mamba | Micromamba: All three have the same CLI. Just use Micromamba; it's the fastest.
- `micromamba env create -f environment.yaml`
- `micromamba activate maccoys`
- `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(dirname $(dirname $(which python)))/lib` - this will add the environment's Python libraries to the `LD_LIBRARY_PATH` variable so they are available for the Rust compiler
- `cargo build` - rustup should install the needed Rust version and compile