Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release single-step model environments #15

Merged
merged 13 commits into from
Aug 2, 2023

Conversation

kmaziarz
Copy link
Contributor

This PR releases instructions for building an appropriate environment for each of the supported single-step models. I've managed to unify all models apart from GLN into a shared base (containing mostly just python, torch and rdkit) which each model type extends with a handful of model-specific dependencies. One last piece still missing are checkpoint files for each model type.

For now, the setup instructions live under reaction_prediction/environments. Longer term we could consider forking the external single-step model repositories and adding simple pip-installability to them, which could allow us to streamline the setup steps further through installation extras, e.g. be able to do something like pip install syntheseus[retro-knn]. Currently the dependencies in the environments are mostly pinned for reproducibility, but if we go with the installation extras approach then we can unpin them wherever possible.

@kmaziarz kmaziarz requested a review from AustinT July 26, 2023 18:36
Copy link
Contributor

@fiberleif fiberleif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Just one question, l noticed that there is an environment.yml file in the root directory of Syntheseus. When should users install this file, and when should users install the environment_shared.yml file located in reaction_prediction/environments directory ?

Some clarifications in the Readme might be helpful?

@kmaziarz
Copy link
Contributor Author

l noticed that there is an environment.yml file in the root directory of Syntheseus. When should users install this file, and when should users install the environment_shared.yml file located in reaction_prediction/environments directory ?

When trying to set up just the core of syntheseus (to e.g. plug in your own single-step models) you'd use the environment from the top-level, while if trying to plug in specific single step models, you'd follow the instructions from reaction_prediction/environments (most of which start with using the environment_shared.yml file but not GLN).

We should think how/where to best explain this to the users though (perhaps easier when we also add the model checkpoints).

Copy link
Collaborator

@AustinT AustinT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Only suggestion is updating top-level README.

README.md Show resolved Hide resolved
@kmaziarz kmaziarz merged commit e059159 into main Aug 2, 2023
3 checks passed
@kmaziarz kmaziarz deleted the kmaziarz/release-single-step-environments branch August 2, 2023 11:59
kmaziarz added a commit that referenced this pull request Aug 15, 2023
There seem to be two issues with `setup_megan.sh` that evaded the
testing done in #15:
- Two necessary dependencies (`gitpython` and `scipy`) are missing
- The version of `torchtext` used forces `pip` to switch to a different
version of `torch` than the one used in the shared environment

This PR addresses both problems.
kmaziarz added a commit that referenced this pull request Aug 17, 2023
Continuing from previous PRs (most notably #15), this PR releases
trained model checkpoints for all currently integrated baselines.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants