We provide Python code for the inverse design of BODIPY molecules, as discussed in Ref. [1].
DesignBodipy_Bayes.py can be used to design molecules using Bayesian optimization based on Gaussian process regression.
DesignBodipy_GA.py can be used for genetic algorithm (GA) optimization.
Both programs use a trained kernel ridge regression machine learning (KRR-ML) model to evaluate the S0→S1 excitation energy.
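To illustrate how a trained KRR model turns a descriptor into a predicted energy, here is a minimal sketch. It is not the code used in the scripts: the Laplacian kernel, the `sigma` value, the function name `krr_predict` and the toy arrays are all assumptions made for illustration. A KRR prediction is the kernel-weighted sum of the trained regression coefficients.

```python
import numpy as np

def krr_predict(x_query, X_train, alpha, sigma=1.0):
    """Kernel ridge regression prediction for a single query descriptor.

    Assumes a Laplacian kernel k(x, x') = exp(-|x - x'|_1 / sigma);
    the kernel and sigma used by the actual model may differ.
    """
    # L1 distance between the query and every training descriptor
    distances = np.abs(X_train - x_query).sum(axis=1)
    # Kernel vector between the query and the training set
    k = np.exp(-distances / sigma)
    # Prediction is the dot product with the trained coefficients
    return float(k @ alpha)

# Toy stand-ins for the stored descriptors and coefficients (hypothetical values)
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
alpha = np.array([1.0, -0.5, 0.25])

energy = krr_predict(np.array([0.0, 0.0]), X_train, alpha, sigma=1.0)
print(f"Predicted value: {energy:.6f}")
```

In the actual workflow the descriptor would be the SLATM representation of a PM7 geometry, and the coefficients would be read from the data directory.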
$ python3 DesignBodipy_Bayes.py <target(eV)>
Additional parameters are listed by the --help argument. All available flags are given below.
Flag/ arg | Description | Default [range] | Compulsory |
---|---|---|---|
[target] | Target S0→S1 excitation energy, in eV. Required positional argument. | - | ✓ |
--group, -g | # of substitutions in target BODIPY. | 2 [2, 7] | ✗ |
--data, -d | Location of datafiles to be used in KRR ML, contains descriptor and coefficients. | ./data | ✗ |
--restart, -r | # of evaluations for single EI evaluation. More evaluations give more robust minima, with higher computation cost. | 5 [1, ∞] | ✗ |
--exploration, -x | Exploitation vs Exploration parameter | 0.01 (0,100) | ✗ |
--seed, -s | Number of initial evaluations to build Gaussian Process surrogate. More evaluations might help converging faster. | 5 [1, ∞] | ✗ |
--iter, -i | Maximum number of iterations. | 200 [1, ∞] | ✗ |
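The role of the exploration parameter can be illustrated with the standard expected improvement (EI) acquisition function. The sketch below is a common textbook form for a minimization problem, using only the Python standard library. Whether DesignBodipy_Bayes.py uses exactly this expression is an assumption, and the names `expected_improvement`, `mu`, `sigma`, `best` and `xi` are chosen here for illustration. A larger `xi` discounts candidates whose predicted mean is only slightly better than the current best, which shifts the search towards uncertain (unexplored) regions.

```python
import math

def expected_improvement(mu, sigma, best, xi=0.01):
    """Expected improvement for minimization.

    mu, sigma : surrogate (Gaussian process) mean and standard deviation
                of the objective at a candidate point
    best      : lowest objective value observed so far
    xi        : exploration parameter (assumed to play the role of --exploration)
    """
    if sigma <= 0.0:
        # No predictive uncertainty, so no improvement is expected
        return 0.0
    improvement = best - mu - xi
    z = improvement / sigma
    # Standard normal CDF and PDF evaluated at z
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return improvement * cdf + sigma * pdf

# Hypothetical candidate: predicted distance from target 0.3 eV, uncertainty 0.2 eV,
# best distance found so far 0.5 eV
ei_exploit = expected_improvement(0.3, 0.2, 0.5, xi=0.01)
ei_explore = expected_improvement(0.3, 0.2, 0.5, xi=1.0)
print(ei_exploit, ei_explore)
```

Here the objective is taken to be the distance between the predicted and target excitation energies, which is also an assumption about how the search is framed.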
Once started, DesignBodipy_Bayes.py runs for up to iter iterations and prints each successive improvement towards the target molecule. An example run is shown below:
$ python3 DesignBodipy_Bayes.py 2.7
Searching for 2D BODIPY near 2.700000 eV
Reading ML model from ./data
Iterations 200; Initial evaluations 5
Bayesian opt. parameters:
Exploration/Exploitation param: 0.010000; Eval. per EI: 5
=================================================================
ITER POS GROUPS S0S1(eV) Target
=================================================================
0 1 6 28 27 3.337201 2.700000
1 3 5 29 30 3.184931 2.700000
2 5 6 27 22 3.183506 2.700000
13 4 5 30 25 2.999981 2.700000
18 5 7 23 19 2.952890 2.700000
38 2 5 15 25 2.866237 2.700000
83 5 4 34 6 2.709659 2.700000
=================================================================
$ python3 DesignBodipy_GA.py <target(eV)>
- Python3
- NumPy
- SciPy (scipy.optimize.minimize for iter minimization)
- Scikit-learn (for Gaussian Process)
- MOPAC for calculating minimum energy geometry at the PM7 level
- OBabel for file conversion
- QML for calculating the SLATM descriptor using the PM7 geometry
A publicly accessible web interface hosting a trained machine learning (ML) model to predict the S0→S1 excitation energy of BODIPYs is available at https://moldis.tifrh.res.in/db/bodipy.
[1] Data-Driven Modeling of S0 -> S1 Transition in the Chemical Space of BODIPYs: High-Throughput Computation, Machine Learning Modeling and Inverse Design,
Amit Gupta, Sabyasachi Chakraborty, Debashree Ghosh, Raghunathan Ramakrishnan,
submitted (2021), arXiv.
Dataset: https://moldis-group.github.io/BODIPYs/
Dataset DOI: 10.6084/m9.figshare.16529214.v1