Machine Learning for Organic Synthesis: Are Robots Replacing Chemists?

Boris Maryasin, Philipp Marquetand, Nuno Maulide

Year: 2018
Citations: 78
Access: Open access

Abstract

Machines learn chemistry: An artificial intelligence algorithm has learned to predict the outcomes of C−N coupling reactions from a few thousand nanomole-scale experiments. This Highlight discusses this work in the context of other state-of-the-art approaches for predicting the yields of organic reactions and explains the significance of the results.

Predicting the outcome of complex chemical transformations is a long-standing challenge for chemists. Quantum-chemical approaches have already opened some opportunities in this direction, and in many cases the outcomes of experiments can be modeled efficiently in silico [1-6]. Artificial intelligence (AI) algorithms that automate, improve, and generalize such predictions are gaining importance, and several recent studies have been published in this area. For example, in 2016, Aspuru-Guzik and co-workers applied neural networks to basic reactions of alkenes and alkyl halides and identified the correct reaction type for the majority of a set of textbook problems [7]. In 2017, Gambin and co-workers tested AI algorithms on a large set (450 000 cases) of diverse organic reactions and emphasized that new chemoinformatic descriptors may be essential for future developments [8]. Other noteworthy attempts to predict and optimize organic reactions with AI include recent studies by the group of Zare [9] and by Jensen, Green, and co-workers [10]. Although the predictions had some limitations, the AI algorithms generally performed encouragingly well, even for sophisticated organic systems.

A recent study by the groups of Doyle and Dreher [11] now demonstrates how the yields of a Buchwald–Hartwig coupling (Scheme 1) with a large set of different substrates can be accurately predicted with an AI algorithm, in this case a so-called random forest. What distinguishes the study is that the data from which the algorithm learns are generated experimentally with a nanomole-scale high-throughput robot. The AI predictions substantially outperformed those of many previous works.

Scheme 1. Buchwald–Hartwig coupling investigated in the study by the groups of Doyle and Dreher [11].

The procedure is as follows. First, the random forest model is trained. Molecular properties of the reactants, for example, their vibrational frequencies or dipole moments, are calculated by quantum chemistry. These properties serve as "descriptors", that is, as inputs for the random forest algorithm. The reaction yield for a given set of reactants is then determined experimentally with the high-throughput robot and fed into the machine learning algorithm. The algorithm learns to reproduce these yields as outputs when provided with the corresponding inputs from the quantum-chemistry calculations. After this training step, the random forest algorithm can predict the reaction yield of other, previously untested reactant combinations. In an oversimplified manner, the procedure could be summarized as: "If the reactants feature these vibrational frequencies and these dipole moments, then the reaction yield will be that number."

In this regard, it is interesting to consider that machine learning algorithms (which have been employed for decades) reason differently from an experimental organic chemist, who would probably not consider properties such as the vibrational spectrum of a reactant or its dipole moment in detail when estimating whether a reaction involving that reactant will give a high or a low yield. The work of Doyle and Dreher is a very promising breakthrough because it achieves excellent prediction accuracy, and it opens a range of opportunities for both theoretical and experimental chemists. It holds promise to dramatically accelerate reaction optimization in modern organic synthesis.
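
The descriptor-to-yield training loop described above can be illustrated with a toy example. The following is a minimal sketch in plain Python, not the implementation used by Doyle and Dreher: the two descriptors, their ranges, the formula that generates the "yields", and all function names are assumptions made purely for illustration, and the data are synthetic rather than experimental. It only shows the general idea of a random forest: many regression trees, each grown on a bootstrap sample with randomly chosen features, whose predictions are averaged.

```python
import random

def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(rows, targets, rng, max_depth=5, min_leaf=3, n_features=1):
    """Grow one regression tree; each split tries a random subset of descriptors."""
    if max_depth == 0 or len(rows) < 2 * min_leaf:
        return mean(targets)  # leaf: average yield of the samples that reach it
    best = None  # (squared error, descriptor index, threshold)
    for f in rng.sample(range(len(rows[0])), n_features):
        for threshold in sorted({r[f] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, f, threshold)
    if best is None:
        return mean(targets)
    _, f, threshold = best
    left_idx = [i for i, r in enumerate(rows) if r[f] <= threshold]
    right_idx = [i for i, r in enumerate(rows) if r[f] > threshold]

    def subtree(idx):
        return build_tree([rows[i] for i in idx], [targets[i] for i in idx],
                          rng, max_depth - 1, min_leaf, n_features)

    return (f, threshold, subtree(left_idx), subtree(right_idx))

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are floats
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node

def train_forest(rows, targets, n_trees=20, seed=0, **tree_kwargs):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], rng, **tree_kwargs))
    return forest

def predict_forest(forest, row):
    return mean([predict_tree(tree, row) for tree in forest])

def synthetic_reaction(rng):
    """ASSUMPTION: invented descriptors and an invented yield formula."""
    freq = rng.uniform(1000.0, 1700.0)  # stand-in vibrational frequency, cm^-1
    dipole = rng.uniform(0.0, 5.0)      # stand-in dipole moment, debye
    yield_pct = 20.0 + 12.0 * dipole + 0.03 * (freq - 1000.0) + rng.gauss(0.0, 3.0)
    return [freq, dipole], min(100.0, max(0.0, yield_pct))

data_rng = random.Random(42)
data = [synthetic_reaction(data_rng) for _ in range(160)]
train, test = data[:120], data[120:]  # "measured" vs. "previously untested" reactions
train_rows = [d for d, _ in train]
train_yields = [y for _, y in train]

forest = train_forest(train_rows, train_yields, n_trees=20, seed=0,
                      max_depth=5, min_leaf=3, n_features=1)

# Compare against the naive guess "every reaction gives the average training yield".
baseline = mean(train_yields)
mae_forest = mean([abs(predict_forest(forest, r) - y) for r, y in test])
mae_baseline = mean([abs(baseline - y) for _, y in test])
print(f"forest MAE: {mae_forest:.1f}  baseline MAE: {mae_baseline:.1f}")
```

A call such as `predict_forest(forest, [1350.0, 2.5])` then plays the role of the oversimplified rule quoted above: given these descriptor values, the model returns a single predicted yield. In practice one would more likely use an established library implementation and many more descriptors; the hand-written trees here only keep the sketch self-contained.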
A particularly interesting outcome of the study r

Keywords

Robot, Computer science, Artificial intelligence, Human–computer interaction
