Reasonable Expectations, Clean Data, Collaboration: The Three Keys to AI in Drug Discovery
AI and machine learning have long been used in drug discovery, but recent hype and massive investments have led to unrealistic expectations. Most AI-driven drug candidates remain in early development and face complex biological and practical challenges, so success depends on setting reasonable goals, ensuring clean and abundant data, and fostering collaboration.
Despite the buzz around artificial intelligence (AI), most industry insiders know that the use of machine learning (ML) in drug discovery is nothing new. For more than a decade, researchers have used computational techniques for many purposes, such as finding hits, modeling drug-protein interactions, and predicting reaction rates.
What is new is the hype. As AI has taken off in other industries, countless start-ups have emerged promising to transform drug discovery and design with AI-based technologies for tasks such as virtual screening, physics-based biological activity assessment, and drug crystal-structure prediction.
Investors have made huge bets that these start-ups will succeed. Investment reached $13.8 billion in 2020. More than one-third of large-pharma executives report using AI technologies.
While a few “AI-native” candidates are in clinical trials, around 90% remain in discovery or preclinical development, so it will take years to see if the bets pay off.
Artificial Expectations
Along with big investments come high expectations: drug the undruggable, drastically shorten timelines, and virtually eliminate wet-lab work. Insider Intelligence projects that AI could reduce discovery costs by as much as 70%.
Unfortunately, it’s just not that easy. The complexity of human biology precludes AI from becoming a magic bullet. On top of this, data must be plentiful and clean enough to use. Models must be reliable. Prospective compounds need to be synthesizable. And drugs have to pass real-life safety and efficacy tests.
While this harsh reality hasn’t slowed investment, it has led to fewer companies receiving funding, to devaluations, and to the discontinuation of some of the loftier programs, such as IBM’s Watson AI for drug discovery.
This raises the question: Is AI for drug discovery more hype than hope? Absolutely not. Do we need to adjust our expectations and position ourselves for success? Absolutely, yes.
But how?
Three Keys to Implementing AI in Drug Discovery
Implementing AI in drug discovery requires three things: reasonable expectations, clean data, and collaboration. Let’s take a closer look at each.
1. Reasonable Expectations
AI can be a valuable part of a company’s larger drug discovery program. But, for now, it’s best thought of as one option in a box of tools. Clarifying when, why, and how AI is used is crucial, albeit challenging.
Interestingly, investment has largely gone to companies developing small molecules, which lend themselves to AI because they’re relatively simple compared to biologics and because there are decades of data on which to build models. The ease of applying AI also varies widely across discovery: models for early screening and physical-property prediction appear easier to implement than those for target prediction and toxicity assessment.
While the potential impact of AI is incredible, we should remember that good things take time. Pharmaceutical Technology recently asked its readers to project how long it might take for AI to reach its peak in drug discovery, and by far, the most common answer was “more than 9 years.”
2. Clean Data
“The main challenge to creating accurate and applicable AI models is that the available experimental data is heterogenous, noisy, and sparse, so appropriate data curation and data collection is of the utmost importance.”
This quote from a 2021 Expert Opinion on Drug Discovery article speaks to the importance of collecting clean data. While it refers to ADME/Tox and activity prediction models, the assertion also holds true in general. AI requires good data, and lots of it.
But good data are hard to come by. Publicly available data can be inadequate, forcing companies to rely on their own experimental data and domain knowledge. Unfortunately, many companies struggle to capture, federate, mine, and prepare their data, perhaps due to skyrocketing data volumes, outdated software, incompatible lab systems, or disconnected research teams. Success with AI will likely elude these companies until they implement technology and workflow processes that let them:
- Facilitate error-free data capture without relying on manual processing
- Handle the volume and variety of data produced by different teams and partners
- Ensure data integrity and standardize data for model readiness
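As a rough illustration of what "standardizing data for model readiness" can involve, here is a minimal Python sketch. It assumes a hypothetical situation in which different teams report IC50 values under inconsistent compound IDs and units. The field names (`compound_id`, `ic50`, `unit`), the unit table, and the choice to merge replicates by median are all assumptions made for this example, not a description of any particular product or pipeline.

```python
import math
from dataclasses import dataclass
from statistics import median
from typing import Optional

# Hypothetical conversion factors to nanomolar (assumption for this sketch).
UNIT_TO_NM = {"nm": 1.0, "um": 1_000.0, "mm": 1_000_000.0}


@dataclass(frozen=True)
class AssayRecord:
    compound_id: str
    ic50_nm: float


def normalize(raw: dict) -> Optional[AssayRecord]:
    """Return a cleaned record, or None if the row cannot be trusted."""
    compound_id = str(raw.get("compound_id", "")).strip().upper()
    unit = str(raw.get("unit", "")).strip().lower()
    try:
        value = float(raw.get("ic50"))
    except (TypeError, ValueError):
        return None  # missing or non-numeric measurement
    if not compound_id or unit not in UNIT_TO_NM:
        return None  # unidentifiable compound or unknown unit
    if not math.isfinite(value) or value <= 0:
        return None  # physically implausible concentration
    return AssayRecord(compound_id, value * UNIT_TO_NM[unit])


def curate(rows):
    """Normalize rows, then merge replicates per compound using the median.

    Returns (curated, rejected): a dict mapping compound ID to IC50 in nM,
    and the number of rows that failed validation.
    """
    grouped = {}
    rejected = 0
    for raw in rows:
        record = normalize(raw)
        if record is None:
            rejected += 1
            continue
        grouped.setdefault(record.compound_id, []).append(record.ic50_nm)
    curated = {cid: median(values) for cid, values in grouped.items()}
    return curated, rejected


# Example rows as two hypothetical teams might report them.
example_rows = [
    {"compound_id": "cmp-001", "ic50": "12.5", "unit": "nM"},
    {"compound_id": "CMP-001 ", "ic50": 0.0135, "unit": "uM"},
    {"compound_id": "CMP-002", "ic50": "n/a", "unit": "nM"},
    {"compound_id": "CMP-003", "ic50": 2, "unit": "mg"},
]
example_curated, example_rejected = curate(example_rows)
print(example_curated, example_rejected)
```

Counting rejected rows rather than silently dropping them is a deliberate choice in this sketch: a rising rejection count is often the first visible sign of a capture or integration problem upstream.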
3. Collaboration
Companies hoping to leverage AI need a full view of all their data, not just bits and pieces. This demands a research infrastructure that lets computational and experimental teams collaborate, uniting workflows and sharing data across domains and locations. Careful process and methodology standardization is also needed to ensure that results obtained with the help of AI are repeatable.
Beyond collaboration within organizations, key industry players are also collaborating with one another to help AI reach its full potential, which makes security and confidentiality key concerns. For example, many large pharmaceutical companies have partnered with start-ups to help drive their AI efforts. Collaborative initiatives, such as the MELLODDY Project, have formed to help companies leverage pooled data to improve AI models. And vendors like Dotmatics are building AI models using customers’ collective experimental data.
References
- 1. Buvailo, A. Will Biologics Surpass Small Molecules In The Pharmaceutical Race? BiopharmaTrend.com. 2022. https://www.biopharmatrend.com/post/67-will-small-molecules-sustain-pharmaceutical-race-with-biologics/
- 2. Kirkpatrick, P. Artificial intelligence makes a splash in small-molecule drug discovery. Biopharmadealmakers. 2022. https://www.nature.com/articles/d43747-022-00104-7
- 3. Lowe, D. AI and Drug Discovery: Attacking the Right Problems. Science. 2021. https://www.science.org/content/blog-post/ai-and-drug-discovery-attacking-right-problems
- 4. Huang, D. Z., Baber, J. C., & Bahmanyar, S. S. The challenges of generalizability in artificial intelligence for ADME/Tox endpoint and activity prediction. Expert Opinion on Drug Discovery. 2021, 16(9), 1045-1056. https://www.tandfonline.com/doi/abs/10.1080/17460441.2021.1901685?journalCode=iedc20