Standard Industries Challenge Puts AI and ML to Work for Chemistry
Will $1 Million Prize Lead to a Breakthrough in Chemical Retrosynthesis?
Grace and our parent company Standard Industries are offering a $1 million prize to revolutionize the field of chemical retrosynthesis. The Standard Industries Chemical Innovation Challenge: Advancing AI-Assisted Molecular Synthesis aims to leverage the power of artificial intelligence (AI) and machine learning (ML) to help transform retrosynthesis, the process by which chemists plan how to make fine chemicals, the essential building blocks for everything from pharmaceuticals to food supplements.
Without AI and ML, retrosynthesis, in which chemists start with a desired molecule and work backward to simpler starting materials, is a time- and resource-intensive task that relies heavily on manual analysis and literature searching. A problem like this can have thousands to millions of possible solutions, and a human expert can focus on only the few that seem plausible, likely missing approaches that may be more efficient, more environmentally sustainable or otherwise valuable. That makes retrosynthesis a problem better suited to a computer than to a human.
But sceptics may wonder how (or when) a computer may be able to mimic the “chemical intuition” of a good chemist.
What Is Chemical Intuition?
Chemical intuition refers to a person's innate or learned ability to understand, predict, and explain chemical phenomena based on their knowledge of chemistry principles and patterns. It goes beyond classroom learning: it involves developing a "feel" for how chemicals interact, react, and behave in various conditions, and then using that understanding to make predictions or solve problems in chemistry.
Usually developed through exposure to a wide variety of chemical systems and reactions, chemical intuition gives chemists the authority (and confidence) to make educated guesses in situations where they don't have complete information or data. These guesses will often show up as hypotheses in the chemical innovation efforts that win the funding to progress through the arduous and exacting research and development process.
AI and ML have the potential to jump-start human chemical intuition, and therefore to speed up chemical innovation. Because retrosynthesis involves so many possible routes to a new molecule, AI and ML can greatly expand the size and diversity of the chemical datasets that scientists can confidently consider. An ML model can be trained on very large data sets to predict outcomes under not-yet-tested conditions, in much the same way that scientists draw on their experience, their chemical intuition, to make such predictions.
Learning from “Less than Perfect” Data
As you can imagine, the history of all chemical reactions is a voluminous data set. But consider this: on the whole, the published literature provides examples only of reactions that worked, and probably only under the specific reaction conditions that worked. Although the amount of data currently available through scientific literature and patent applications seems large, it is vanishingly small in comparison with the number of organic molecules that could be created: the GDB-17 database alone enumerates around 166 billion small organic molecules*. Considering all these possibilities is where AI and ML can shine, but only if the systems have access to the data, including information on reactions and conditions that failed to produce the desired outcome.
The outcome of the Challenge could revolutionize our approach to chemical retrosynthesis—and potentially spur groundbreaking advancements. By harnessing AI and ML, we can drive the creation of novel, safe, and cost-effective synthetic routes more quickly and potentially speed up the drug development and materials science cycle for the benefit of humankind.
Follow the Challenge online and keep an eye on our Insights for updates!
*Ruddigkeit, L., Van Deursen, R., Blum, L. C., & Reymond, J. L. (2012). Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. Journal of Chemical Information and Modeling, 52(11), 2864-2875.