5 Benefits Of Leveraging Machine Learning In Your Process Development Workflows

The pressure to accelerate timelines sits at the heart of every drug development program. The faster sponsors identify solutions to early-stage challenges, the earlier they can establish a scalable drug manufacturing process, file their investigational new drug application (IND), and reach patients in the clinic. One primary objective in this early phase is to develop and optimize a safe and scalable synthetic route while maintaining consistent purity control.

As artificial intelligence (AI) and machine learning (ML) become part of the drug manufacturing toolkit, sponsors and contract development and manufacturing organizations (CDMOs) are exploring how these technologies can improve process chemistry and generate scalable synthetic routes more efficiently. With AI and ML capabilities at their fingertips, sponsors can initiate faster pathways to patients while improving quality, reducing costs, and increasing product yield. Here are five ways ML advances early-stage drug development.

1. Accelerate Experimentation to Reduce Timelines

In a typical process development (PD) setting, chemists and engineers conduct experiments to assess which parameters ensure safety and impurity control, often changing one variable at a time to determine its impact on critical quality attributes (CQAs). ML models, including random forest models, can jump-start early-stage PD by analyzing both large and small data sets and virtually screening multiple process parameters simultaneously.

For example, an ML model can help reveal a scalable synthetic route and optimal process parameters by identifying variables and recommending experiments that provide better control for the objectives, including:

Reaction yields
Selectivity
Impurity levels
Reaction conversions
Other requested outputs

As a result, fewer physical experiments are required in PD, saving time and enhancing efficiency.

At Grace Fine Chemical Manufacturing Services, teams are applying ML approaches such as sequential learning to help improve PD. In this approach, Grace chemists and engineers build the model and define a search space and objectives, and the ML platform produces a list of experiments to run, some of which would not be expected to achieve the desired values. This “suboptimal” data can be very useful for training the algorithm. Users then feed the results back into the platform, which proposes a second round of experiments based on the new information. With each round of sequential learning, the results align more closely with predicted values, sometimes even exceeding the predicted outcomes. Users can fine-tune which experiments to run next from the platform’s suggestions to further dial in process conditions.
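
The propose, run, and re-ingest loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Grace's platform: the search space, the made-up yield function standing in for lab work, the inverse-distance surrogate, and every name in the block are assumptions chosen to keep the example self-contained.

```python
import itertools
import math

# Hypothetical search space: reaction temperature (deg C) and reagent equivalents.
TEMPS = [40, 50, 60, 70, 80, 90]
EQUIVS = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
SEARCH_SPACE = list(itertools.product(TEMPS, EQUIVS))


def run_experiment(temp, equiv):
    """Stand-in for a physical experiment; returns a made-up yield (%)."""
    return 90.0 - 0.02 * (temp - 70) ** 2 - 40.0 * (equiv - 1.2) ** 2


def scaled_distance(a, b):
    """Distance between two conditions after scaling each axis to 0..1."""
    dt = (a[0] - b[0]) / (TEMPS[-1] - TEMPS[0])
    de = (a[1] - b[1]) / (EQUIVS[-1] - EQUIVS[0])
    return math.hypot(dt, de)


def predict(point, observed):
    """Surrogate model: inverse-distance-weighted average of yields seen so far."""
    weights = {p: 1.0 / (scaled_distance(point, p) + 1e-9) for p in observed}
    total = sum(weights.values())
    return sum(weights[p] * observed[p] for p in observed) / total


def propose(observed, batch_size=3, explore=20.0):
    """Rank untested conditions by predicted yield plus a novelty bonus.

    The bonus deliberately pulls in some conditions that are not expected
    to perform well, because they still teach the surrogate something.
    """
    def score(point):
        nearest = min(scaled_distance(point, p) for p in observed)
        return predict(point, observed) + explore * nearest

    untested = [p for p in SEARCH_SPACE if p not in observed]
    return sorted(untested, key=score, reverse=True)[:batch_size]


def sequential_learning(rounds=4):
    """Run several propose/run/re-ingest rounds; return data and best-so-far yields."""
    # Seed set of three arbitrary starting conditions.
    seed = [(40, 1.0), (90, 1.5), (60, 1.3)]
    observed = {p: run_experiment(*p) for p in seed}
    history = [max(observed.values())]
    for _ in range(rounds):
        for point in propose(observed):                # platform suggests experiments
            observed[point] = run_experiment(*point)   # chemists run them
        history.append(max(observed.values()))         # results are re-ingested
    return observed, history
```

Calling sequential_learning() returns every condition tested along with the best yield seen after each round. A production platform would typically swap the toy surrogate for a trained model such as a random forest and let users veto or reorder the proposed experiments.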

2. Enhance Understanding and Insight Generation

ML algorithms can recognize complex interactions and patterns between variables that are difficult to discern with traditional experimental design and statistical analysis. This includes providing insight into the mechanism of formation for low-level impurities and tracking factors that influence yield. When ML techniques are implemented, they can reveal new areas for optimization. For example, feature importance analysis can highlight the most influential parameters for your model², while SHapley Additive exPlanations (SHAP) values, which are based on cooperative game theory, can be used to increase transparency and interpretability.³
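
To make the game-theory idea concrete, the sketch below computes exact Shapley values for a small, hypothetical yield model by enumerating every coalition of features. It is a from-scratch illustration rather than the SHAP library's API, and it assumes one simple convention: features outside a coalition are held at a single baseline condition. The model coefficients and parameter names are invented for the example.

```python
from itertools import combinations
from math import factorial


def predicted_yield(temp, equiv, agitation):
    """Hypothetical fitted model: yield (%) from three process parameters."""
    return 50.0 + 0.3 * temp + 10.0 * equiv + 0.002 * temp * agitation


def shapley_values(model, instance, baseline):
    """Exact Shapley attribution of model(instance) - model(baseline).

    Features outside a coalition are held at their baseline value
    (an assumption of this sketch).
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Evaluate the model with coalition members at instance values.
        args = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(**args)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = value(set(subset) | {name}) - value(set(subset))
                total += weight * gain
        phi[name] = total
    return phi
```

The attributions always sum to the difference between the prediction for the run of interest and the prediction for the baseline run, which is what makes them useful for explaining a single result. Exact enumeration grows exponentially with the number of parameters, so practical tools rely on approximations.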

In a typical Design of Experiments (DoE) workflow, scientists run a set of experiments and input the results into software that identifies which variables are important. While DoE remains a crucial tool across PD, ML adds further insight to enhance DoE findings. In one DoE study, Grace chemists anticipated that agitation would be identified as an important process variable, but the DoE software did not flag it as important. However, when the data was input into an ML platform, the ML algorithm uncovered confounding variables and correctly assessed agitation as an important process variable.

ML models are trained to predict product quality attributes based on process parameters such as temperature, amount and identity of raw materials, and hold times. This helps establish control strategies to maintain specified CQAs, even if there is process variability. Grace teams design ML models to adhere to and account for a client’s specific unit operation parameters and outputs, yield measurement, and known impurities.

For chemists and engineers whose goal is to understand every aspect of a highly complex process, ML provides enhanced insights on data that may not be well understood. For example, if teams are struggling to determine why yields are lower than anticipated, ML can be implemented to analyze trends and assess the cause of lower yields. With this insight, PD teams can adapt their process to improve the existing route of synthesis.

3. Improve Analytical Method Development

One common delay in PD is traditional method development, because it can take months to develop a robust method that sufficiently detects all impurities. ML is accelerating this activity by identifying key parameters that contribute to peak separation, intensity, and shape. The algorithm then suggests adjustments that sharply reduce the time needed to identify optimal instrument parameters.

As one example, Grace scientists used ML to refine gas chromatography (GC) methodology by distilling large volumes of analytical data and facilitating the rapid optimization of parameters like peak resolution and detection limits to make the method more precise and reliable. In this instance, an analytical R&D chemist spent six weeks developing a GC method that demonstrated a suboptimal peak resolution within the targeted specification. From there, ML was deployed to further optimize the method and improve the peak resolution in less than a week, significantly reducing the amount of time needed to produce a commercial-level method.
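
For readers less familiar with the metric, peak resolution between two adjacent peaks is commonly calculated from their retention times and baseline widths as Rs = 2(t2 - t1) / (w1 + w2), with Rs of about 1.5 often cited as baseline separation. The sketch below shows how candidate methods could be ranked on that metric. The function names, the run dictionary layout, and the 1.5 default target are assumptions for illustration, not details of the Grace study.

```python
def peak_resolution(t1, t2, w1, w2):
    """Resolution from retention times and baseline peak widths (same units)."""
    if w1 <= 0 or w2 <= 0:
        raise ValueError("peak widths must be positive")
    return 2.0 * abs(t2 - t1) / (w1 + w2)


def rank_methods(runs, target=1.5):
    """Sort candidate runs by critical-pair resolution, best first.

    Each run is a dict with a label plus t1, t2, w1, w2 for the two
    closest-eluting peaks. Returns (label, resolution, meets_target) tuples.
    """
    scored = []
    for run in runs:
        rs = peak_resolution(run["t1"], run["t2"], run["w1"], run["w2"])
        scored.append((run["label"], rs, rs >= target))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

An ML-guided workflow would use a score like this as the objective, proposing new instrument settings that are predicted to raise the resolution of the hardest-to-separate pair.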

When robust analytical methods are installed faster, early-stage experiments will be run with the same analytical tools as the resulting commercial processes, enabling data synergy and mitigating risk across the full product lifecycle. Additionally, rapid analytical method development enables further reduction in process development timelines, as process-specific ML models typically require fewer rounds of sequential learning to identify meaningful trends when the input data has a higher signal-to-noise ratio.

4. Optimize Resource Utilization to Improve Productivity and Project Timelines

ML can enhance productivity and efficiency efforts by identifying new ways to reduce raw material and energy consumption. Users input objectives for solvent and raw material reduction that translate to cost savings for both purchasing and waste disposal. Similarly, ML can be applied to identify the conditions that shorten reaction times or lower reaction temperatures, both of which can reduce energy consumption.

5. Streamline Scale-up and Technology Transfer

Finally, ML models trained on data from pilot-scale experiments can be used to predict process performance at full production scale, reducing the risk of unexpected scale-up problems and facilitating technology transfer between manufacturing sites. The implementation of AI and ML to conduct real-time system monitoring can help ensure that transferred processes run consistently and accurately.⁵ Opting to implement automated technology wherever possible will enhance consistency and standardization while reducing the risk of human error.⁵

With high-quality, reliable data sets from a sponsor, ML and AI can be instituted to rapidly scan and summarize sponsor documents in preparation for tech transfer to a CDMO. Eventually, AI and ML might even provide risk assessment and regulatory recommendations to support compliance throughout tech transfer.⁵

ML Enables Teams to Work Smarter, Not Harder

Though there are ample possibilities for ML implementation in PD, there is also the fear that this technology will replace jobs across the drug development and manufacturing space. On the contrary, despite its ability to provide insight and accelerate experimentation, ML succeeds only when chemists and engineers design, interpret, and validate the models. The need for chemical intuition and experience remains essential; ML and AI should enhance the work of these specialists and free up space for creative problem solving. What’s more, some processes are inherently difficult to model, making the work of expert teams and existing PD tools (e.g., DoE) nothing short of critical.

Grace’s ML models provide each predicted attribute with an R-squared (R²) score, a quantity that reflects how well independent variables in a statistical model explain variation in the dependent variable.⁴ If the correct parameters are input into a model, there should be alignment between a model’s predicted values and an experiment’s actual values. Lack of alignment indicates that a parameter is missing or that there is too much noise in the data. Grace’s ML model also flags individual outlier experiments whose attributes differ radically from what the model would predict from the remaining data set. This allows chemists and engineers to evaluate the quality of a given data set and to address both systemic and special cause issues more efficiently.
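
As a rough illustration of both checks, the sketch below computes R² as 1 minus the ratio of residual to total sum of squares, then flags runs whose residual sits far from the rest. The outlier rule is a deliberate simplification: it applies a standard-deviation threshold to residuals instead of refitting the model without each run, and the function names and 2.0 default threshold are assumptions.

```python
from statistics import mean, pstdev


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values have no variance")
    return 1.0 - ss_res / ss_tot


def flag_outliers(actual, predicted, threshold=2.0):
    """Indices of runs whose residual deviates from the mean residual
    by more than `threshold` population standard deviations."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    center = mean(residuals)
    spread = pstdev(residuals)
    if spread == 0:
        return []  # every run is off by the same amount
    return [i for i, r in enumerate(residuals)
            if abs(r - center) > threshold * spread]
```

A low R² across the whole data set points toward a missing parameter or noisy measurements, while a single flagged run more often points toward a one-off event worth investigating at the bench.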

Indeed, ML is limited by the quality and quantity of the data it has access to. More data does not necessarily equal better results, especially if data is biased, noisy, or irrelevant, all of which contribute to misleading results when coupled with the high sensitivity of initial ML models. Data cleaning, curation, and validation are essential to the accuracy of ML models.

The enhancements ML and AI bring to drug manufacturing have only started to be realized. To maximize the impact of ML and AI on your early drug development, work with a CDMO that is already enacting these technologies to improve timelines, ensure reliable processes, and manufacture a high-quality product. Collaborating with a partner committed to ML and AI will help accelerate process development as your team strives toward clinical success and beyond.

References

1. Sanchez-Lengeling, et al. (2021). A Gentle Introduction to Graph Neural Networks. Distill. https://distill.pub/2021/gnn-intro/
2. Shin, T. (2024, November 7). Understanding Feature Importance in Machine Learning. Built In. https://builtin.com/data-science/feature-importance
3. Trevisan, V. (2022, January 17). Using SHAP Values to Explain How Your Machine Learning Model Works. Towards Data Science. https://towardsdatascience.com/using-shap-values-to-explain-how-your-machine-learning-model-works-732b3f40e137/
4. Fernando, J. (2024, November 13). R-Squared: Definition, Calculation, and Interpretation. Investopedia. https://www.investopedia.com/terms/r/r-squared.asp
5. Nahum, S. (2023, May 5). Your Next Tech Transfer Should Include These Modern Tools. Bioprocess Online. https://www.bioprocessonline.com/doc/your-next-tech-transfer-should-include-these-modern-tools-0001