- ZINC: 736M Compounds.
- ChEMBL: 2.3M Compounds, 19.7M Activities.
- QM9: 133,885 Compounds.
- PubChem: 112M Compounds, 296M Bioactivities.
> SMILES-based
[GVAE] Grammar variational autoencoder (ICML 2017) [Paper] [Code]
[chemical VAE] Automatic chemical design using a data-driven continuous representation of molecules (ACS Cent. Sci. 2018) [Paper] [Code]
[SD-VAE] Syntax-directed variational autoencoder for molecule generation (ICLR 2018) [Paper] [Code]
> Graph-based
[GraphVAE] GraphVAE: Towards generation of small graphs using variational autoencoders (ICANN 2018) [Paper]
[CGVAE] Constrained graph variational autoencoders for molecule design (NeurIPS 2018) [Paper] [Code]
[JT-VAE] Junction tree variational autoencoder for molecular graph generation (ICML 2018) [Paper] [Code]
[NEVAE] NEVAE: A deep generative model for molecular graphs (AAAI 2020) [Paper] [Code]
> SMILES-based
[Segler's model] Generating focused molecule libraries for drug discovery with recurrent neural networks (ACS Cent. Sci. 2018) [Paper]
[CLM] Generative molecular design in low data regimes. (Nat. Mach. Intell. 2020) [Paper] [Code]
[DDC] Direct steering of de novo molecular generation with descriptor conditional recurrent neural networks. (Nat. Mach. Intell. 2020) [Paper] [Code]
[BIMODAL] Bidirectional molecule generation with recurrent neural networks. (J. Chem. Inf. Model. 2020) [Paper] [Code]
[Scaffold Decorator] SMILES-based deep generative scaffold decorator for de-novo drug design. (J. Cheminform. 2020) [Paper] [Code]
> Graph-based
[Li's model] Learning deep generative models of graphs (ICLR 2018) [Paper]
[Li's model] Multi-objective de novo drug design with conditional graph generative model (J. Cheminform. 2018) [Paper] [Code]
[MolecularRNN] MolecularRNN: Generating realistic molecular graphs with optimized properties (arXiv 2019) [Paper]
> SMILES-based
[ORGAN] Objective-reinforced generative adversarial networks (ORGAN) for sequence generation models (arXiv 2017) [Paper] [Code]
[ORGANIC] Optimizing distributions over molecular space. An objective reinforced generative adversarial network for inverse-design chemistry (ORGANIC) (ChemRxiv 2017) [Paper] [Code]
[latentGAN] A de novo molecular generation method using latent vector based generative adversarial network. (J. Cheminform. 2019) [Paper] [Code]
[stacked GAN] De novo generation of hit-like molecules from gene expression signatures using artificial intelligence. (Nat. Commun. 2020) [Paper]
> Graph-based
[MolGAN] MolGAN: An implicit generative model for small molecular graphs (ICML workshop 2018) [Paper] [Code]
[Mol-CycleGAN] Mol-CycleGAN: a generative model for molecular optimization (J. Cheminform. 2020) [Paper] [Code]
[GraphNVP] GraphNVP: An invertible flow model for generating molecular graphs (arXiv 2019) [Paper] [Code]
[GRF] Graph residual flow for molecular graph generation (arXiv 2019) [Paper]
[GraphAF] GraphAF: a flow-based autoregressive model for molecular graph generation (ICLR 2020) [Paper] [Code]
[MoFlow] MoFlow: an invertible flow model for generating molecular graphs (KDD 2020) [Paper] [Code]
[MolGrow] MolGrow: A graph normalizing flow for hierarchical molecular generation (AAAI 2021) [Paper]
> String-based
[Nigam's model] Augmenting genetic algorithms with deep neural networks for exploring the chemical space (ICLR 2020) [Paper] [Code]
[STONED] Beyond generative models: superfast traversal, optimization, novelty, exploration and discovery (STONED) algorithm for molecules using SELFIES (Chem. Sci. 2021) [Paper] [Code]
> Graph-based
[GB-GA] A graph-based genetic algorithm and generative model/Monte Carlo tree search for the exploration of chemical space (Chem. Sci. 2019) [Paper] [Code]
[GB-GM-MCTS] A graph-based genetic algorithm and generative model/Monte Carlo tree search for the exploration of chemical space (Chem. Sci. 2019) [Paper] [Code]
[BOSS] BOSS: Bayesian optimization over string spaces (NeurIPS 2020) [Paper] [Code]
[GPBO] A fresh look at de novo molecular design benchmarks (NeurIPS Workshop 2021) [Paper] [Code]
[MARS] MARS: Markov molecular sampling for multi-objective drug discovery (ICLR 2021) [Paper] [Code]
[MIMOSA] MIMOSA: Multi-constraint molecule sampling for molecule optimization (AAAI 2021) [Paper] [Code]
> String-based
[REINVENT] Molecular de-novo design through deep reinforcement learning (J. Cheminform. 2017) [Paper] [Code]
> Graph-based
[MolDQN] Optimization of molecules via deep reinforcement learning (Sci. Rep. 2019) [Paper] [Code]
[GCPN] Graph convolutional policy network for goal-directed molecular graph generation (NeurIPS 2018) [Paper] [Code]
[RationaleRL] Multi-objective molecule generation using interpretable substructures (ICML 2020) [Paper] [Code]
[FREED] Hit and lead discovery with explorative RL and fragment-based molecule generation (NeurIPS 2021) [Paper] [Code]
[Pasithea] Deep molecular dreaming: Inverse machine learning for de-novo molecular design and interpretability with surjective representations (Mach. Learn.: Sci. Technol. 2021) [Paper] [Code]
[DST] Differentiable scaffolding tree for molecular optimization (ICLR 2022) [Paper]
GuacaMol: Benchmarking Models for de Novo Molecular Design (J. Chem. Inf. Model. 2019) [Paper] [Code]
Sample Efficiency Matters: A Benchmark for Practical Molecular Optimization (arXiv 2022) [Paper]