This is a non-exhaustive list of methods for building machine learning models for reaction yield prediction.
The dataset comes from a recent publication by D.M. Makarov et al. (Journal of Computational Science 74 (2023) 102173, https://doi.org/10.1016/j.jocs.2023.102173), which describes it as follows:
"""We considered pyrrole or dipyrromethane condensation reactions with various aldehydes, resulting in the production of boron(III) dipyrromethene or BODIPY (681 records). These reactions were retrieved from articles (see “Dataset reactions” and Scheme S1). Additionally, we used the reactions of the production of dipyrromethane (111 records) and porphyrins (457 records). All condensation reactions for dipyrromethanes and 213 reactions for porphyrins with various aldehydes were obtained in our laboratory. The remaining 244 reactions for the porphyrins synthesis were obtained from articles (see “Dataset reactions”). Our experimental dataset is based on a study of pyrrole condensation processes with aldehydes, using catalytic amounts of organic acids to produce ms-aryl- and β-alkyl-substituted dipyrromethanes."""
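The record counts in the quote can be tallied quickly to see how the dataset is split across reaction classes. This is only a sketch of the arithmetic: the numbers are taken from the quoted passage, while the names `records` and `porphyrin_sources` are hypothetical and do not come from the dataset files.

```python
# Record counts per reaction class, as stated in the quoted passage.
# The dictionary names are illustrative assumptions, not dataset columns.
records = {
    "BODIPY": 681,
    "dipyrromethane": 111,
    "porphyrin": 457,
}

# Porphyrin records split by origin, also taken from the quote.
porphyrin_sources = {"in-house": 213, "literature": 244}

total = sum(records.values())

# The two porphyrin sources should add up to the porphyrin record count.
assert sum(porphyrin_sources.values()) == records["porphyrin"]

# Print each class with its count and share of the whole dataset.
for name, n in records.items():
    print(f"{name:15s} {n:5d}  ({n / total:.1%})")
print(f"{'total':15s} {total:5d}")
```

The total comes to 1249 records, with BODIPY reactions making up a little over half of them, so any yield model trained on the full set will be weighted toward that class.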
The objective of this repository is to introduce methods for predicting reaction yield.
I am a beginner, so there is likely plenty of room for improvement in this notebook. I was largely inspired by:
the work of D.M. Makarov et al.,
STEPHEN LEE's notebook (BELKA: Molecule Representations for ML Tutorial),
VLAD SKIN's notebook (Tutorial ML In Chemistry Research. RDkit & mol2vec),
the work of Alexander A. Ksenofontov and co-workers (10.1016/j.jocs.2023.102173),
the work of Jean-Louis Reymond and co-workers (10.1039/d1dd00006c),
André OLIVEIRA's notebook (Predicting molecule properties based on its SMILES).
Thanks to them!