Release single-step model environments #15

kmaziarz · 2023-07-26T14:31:10Z

This PR releases instructions for building an appropriate environment for each of the supported single-step models. I've managed to unify all models apart from GLN into a shared base (containing mostly just python, torch and rdkit), which each model type extends with a handful of model-specific dependencies. The one piece still missing is the checkpoint files for each model type.

For now, the setup instructions live under `reaction_prediction/environments`. Longer term, we could consider forking the external single-step model repositories and making them pip-installable, which would let us streamline the setup further through installation extras, e.g. something like `pip install syntheseus[retro-knn]`. Currently, the dependencies in the environments are mostly pinned for reproducibility, but if we go with the installation extras approach we can unpin them wherever possible.
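As a rough illustration of the installation-extras idea, the sketch below builds a mapping of the shape that setuptools accepts as `extras_require`, with each model extending a shared base. Everything in it is an assumption rather than the actual syntheseus packaging: the shared list is taken loosely from the description above, the MEGAN entries from the follow-up fix referenced in this thread, the `retro-knn` entry is an empty placeholder, and `build_extras` is a hypothetical helper.

```python
# Illustrative only: these lists are assumptions, not the real syntheseus packaging.
SHARED_BASE = ["torch", "rdkit"]  # the shared base is described as mostly python, torch and rdkit

MODEL_SPECIFIC = {
    "megan": ["gitpython", "scipy", "torchtext"],  # packages named in the MEGAN setup fix
    "retro-knn": [],  # placeholder: the real model-specific dependencies are not listed here
}


def build_extras(shared, model_specific):
    """Combine the shared base with each model's own dependencies (hypothetical helper)."""
    extras = {}
    for name, deps in model_specific.items():
        # Deduplicate and sort so the result is deterministic.
        extras[name] = sorted(set(shared) | set(deps))
    return extras


# A mapping of this shape could be passed to setuptools as `extras_require`.
EXTRAS_REQUIRE = build_extras(SHARED_BASE, MODEL_SPECIFIC)
```

Leaving the entries unpinned matches the note above that pins could be relaxed if the extras approach is adopted.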

fiberleif

LGTM!

Just one question: I noticed that there is an `environment.yml` file in the root directory of Syntheseus. When should users install this file, and when should they install the `environment_shared.yml` file located in the `reaction_prediction/environments` directory?

Some clarification in the README might be helpful.

kmaziarz · 2023-07-27T11:04:59Z

> I noticed that there is an `environment.yml` file in the root directory of Syntheseus. When should users install this file, and when should users install the `environment_shared.yml` file located in the `reaction_prediction/environments` directory?

To set up just the core of syntheseus (e.g. to plug in your own single-step models), use the environment from the top level. To plug in one of the specific single-step models, follow the instructions in `reaction_prediction/environments`; for every model except GLN, these start from the `environment_shared.yml` file.
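The rule above can be written down as a small lookup. This is only a sketch: `environment_file` is a hypothetical helper, the paths are the ones that appear in this PR, and treating the GLN Dockerfile as the starting point for GLN is an assumption based on GLN not using the shared base.

```python
from pathlib import PurePosixPath
from typing import Optional

# Directory holding the single-step model setup instructions (path as it appears in this PR).
ENV_ROOT = PurePosixPath("syntheseus/reaction_prediction/environments")


def environment_file(model: Optional[str] = None) -> PurePosixPath:
    """Return the file to start from (hypothetical helper).

    model=None means "core of syntheseus only", e.g. when plugging in your own model.
    """
    if model is None:
        # Core-only setup uses the top-level environment.
        return PurePosixPath("environment.yml")
    if model.lower() == "gln":
        # Assumption: GLN is not built on the shared base and starts from its own Dockerfile.
        return ENV_ROOT / "gln" / "Dockerfile"
    # Every other supported model starts from the shared environment.
    return ENV_ROOT / "environment_shared.yml"
```

For example, `environment_file("megan")` points at the shared environment file, which the model-specific setup script then extends.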

We should think about how and where best to explain this to users, though (perhaps that will be easier once we also add the model checkpoints).

AustinT

Looks good to me. My only suggestion is to update the top-level README.

There seem to be two issues with `setup_megan.sh` that evaded the testing done in #15:

- Two necessary dependencies (`gitpython` and `scipy`) are missing
- The version of `torchtext` used forces `pip` to switch to a different version of `torch` than the one used in the shared environment

This PR addresses both problems.
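A silent `torch` swap like the second issue can be caught with a small before/after check. The following is a minimal sketch, assuming the installed version is recorded before a model-specific setup script runs and compared afterwards. The function names are hypothetical and not part of syntheseus; only the standard-library `importlib.metadata` module is used.

```python
from importlib import metadata
from typing import Callable, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_changed(
    package: str,
    before: Optional[str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> bool:
    """Report whether the currently installed version differs from the recorded one.

    `lookup` is injectable so the check can be exercised without the package installed.
    """
    return lookup(package) != before
```

In practice one would call `installed_version("torch")` once the shared environment is in place, run the setup script, and then call `version_changed("torch", recorded_version)` to see whether `pip` replaced it.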

Continuing from previous PRs (most notably #15), this PR releases trained model checkpoints for all currently integrated baselines.

kmaziarz added 2 commits July 26, 2023 13:56

feat(environments): Release all single-step model environments

c787215

chore(.gitignore): Ignore environments/external/

8f3a89a

kmaziarz requested review from fiberleif and mrwnmsr July 26, 2023 14:31

kmaziarz added 4 commits July 26, 2023 15:53

chore(environments): Drop set -x, add shebangs

db32cac

fix(root_aligned): Use imports starting from the root of the repository

0af230f

fix(root_aligned): Prevent the inference code from making unnecessary files

7a1b277

fix(retro_knn): Fix setup script

435dc79

kmaziarz requested a review from AustinT July 26, 2023 18:36

fiberleif approved these changes Jul 27, 2023

chore(environments): Use a simpler PYTHONPATH

62d79ec

AustinT approved these changes Jul 27, 2023

syntheseus/reaction_prediction/environments/README.md (outdated)

syntheseus/reaction_prediction/environments/environment_shared.yml

syntheseus/reaction_prediction/environments/gln/Dockerfile

AustinT reviewed Jul 27, 2023

syntheseus/reaction_prediction/environments/setup_shared.sh (outdated)

kmaziarz added 2 commits July 31, 2023 14:02

refactor(reaction_prediction): Unify the names of the external model repos

a9b05bd

feat(README): Point to single-step setup instructions

6701b1a

kmaziarz commented Aug 1, 2023

README.md

kmaziarz added 4 commits August 2, 2023 10:22

chore(environments): Change PYTHONPATH

bb374c7

feat(environments): Mention the CUDA version pin

fc7eeb9

Merge branch 'main' into kmaziarz/release-single-step-environments

28ba718

doc(CHANGELOG): Attach #15 to the entry for #14

deaeb43

kmaziarz merged commit e059159 into main Aug 2, 2023
3 checks passed

kmaziarz deleted the kmaziarz/release-single-step-environments branch August 2, 2023 11:59

This was referenced Aug 15, 2023

Fix environment setup script for MEGAN #20

Merged

Release single-step model checkpoints #21

Merged

kmaziarz added a commit that referenced this pull request Aug 17, 2023

Release single-step model checkpoints (#21)

8879c0e
