ML code to accompany the Synthetic Fermentation HTE publication

jugoetz/synferm-predictions

ML Training and Inference for Synthetic Fermentation

License: MIT. Code style: black.

Preprint: Predicting Three-Component Reaction Outcomes from 40k Miniaturized Reactant Combinations

For code used to collect experimental data, see this repository.

Installation

conda env create -f environment.yaml

or (if you don't have a suitable GPU):

conda env create -f environment_cpuonly.yaml
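
If you are unsure which file applies, a small check like the following may help. It is only a heuristic sketch: it assumes that finding nvidia-smi on the PATH indicates a usable Nvidia GPU, which is not guaranteed. The function name pick_environment_file is made up for this example and is not part of this repository.

```python
import shutil


def pick_environment_file(which=shutil.which):
    """Return the conda environment file to use on this machine.

    Heuristic (an assumption, not part of this repository): an `nvidia-smi`
    executable on PATH is taken as a sign that an Nvidia GPU and driver exist.
    """
    has_nvidia_driver = which("nvidia-smi") is not None
    return "environment.yaml" if has_nvidia_driver else "environment_cpuonly.yaml"


if __name__ == "__main__":
    # Print the install command suggested by the heuristic.
    print(f"conda env create -f {pick_environment_file()}")
```

If the heuristic picks the GPU file but installation fails, falling back to environment_cpuonly.yaml is the safer choice.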

Notes:

  • The environment.yaml file is written for a workstation with an Nvidia GPU. Use environment_cpuonly.yaml instead to run on CPU only. This skips CUDA and installs the CPU-only versions of pytorch and dgl.
  • Installing the dgl dependency through conda sometimes creates issues where some packages are "not found" despite existing in the specified channels. Instead, try installing dgl separately with pip:
    pip install dgl -f https://data.dgl.ai/wheels/repo.html
  • On some systems with outdated libraries (such as university clusters), the dgl wheels may not work, and you may need to build dgl from source. See the DGL installation guide for more information.
  • pytorch is incompatible with the 2024.1.x versions of the mkl dependency. If you see ImportError [...] undefined symbol: iJIT_NotifyEvent, downgrade mkl with conda install mkl=2024.0
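
To check whether the mkl issue from the last note affects your environment, you can try importing torch and inspect the error message. The sketch below does this; the helper name diagnose_torch_import is hypothetical, and the only detail taken from the note is the symbol name iJIT_NotifyEvent and the suggested downgrade command.

```python
import importlib


def diagnose_torch_import(import_module=importlib.import_module):
    """Try to import torch and return a short hint describing the outcome."""
    try:
        import_module("torch")
    except ImportError as err:
        # The symbol name below is the one quoted in the installation notes.
        if "iJIT_NotifyEvent" in str(err):
            return "mkl conflict: run `conda install mkl=2024.0`"
        return f"torch import failed for another reason: {err}"
    return "torch imported without errors"


if __name__ == "__main__":
    print(diagnose_torch_import())
```

Run it inside the activated conda environment so that the import resolves against the packages you just installed.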

Log in to WandB

We track training runs with WandB. Before starting a training run, log in to WandB by running

wandb login

then supply your API key.

Training models

The run.py script serves as an entrypoint for training models. It is configured with a set of command line arguments, including the path to a configuration file with model hyperparameters. See config/config_example.yaml for an example configuration file.

To see the full list of command line arguments, run:

python run.py train --help

Predicting using trained models

The inference.py script serves as an entrypoint for predicting reaction outcomes. It expects a CSV file with three columns: initiator, monomer, terminator. See config/config_example.yaml for an example configuration file.

Call it like:

python inference.py -i example_reactants.csv -o out.csv

or use python inference.py --help for more information.
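
To make the expected input concrete, the sketch below writes a CSV with the three required columns. The column names come from the description above. The cell values are placeholders, since this README does not specify how reactants are encoded, and the helper name write_reactants_csv is hypothetical. The output file name matches the one used in the example call.

```python
import csv

# Column names as listed in the inference section of this README.
COLUMNS = ["initiator", "monomer", "terminator"]


def write_reactants_csv(path, rows):
    """Write reactant combinations to `path` with the three expected columns."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    # Placeholder values: replace them with your reactants in the
    # representation that inference.py expects.
    rows = [
        {
            "initiator": "<initiator_1>",
            "monomer": "<monomer_1>",
            "terminator": "<terminator_1>",
        },
    ]
    write_reactants_csv("example_reactants.csv", rows)
```

csv.DictWriter raises a ValueError if a row contains a key outside the three columns, which catches misspelled headers before the file reaches inference.py.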

Development

We use nbstripout to remove output from notebooks before committing to the repository. Install with:

conda install -c conda-forge nbstripout # or pip install nbstripout
nbstripout --install  # configures git filters and attributes for this repo

We use pre-commit hooks to ensure standardized code formatting. Install with:

pre-commit install
