Dotmatics

From Data Silos to AI-Ready Science: The Vision Behind Dotmatics Luma

Dotmatics Luma is a new, AI-driven, multimodal scientific platform designed to overcome data silos and fragmented R&D systems. It unites diverse scientific software and streamlines workflows across biologics, small molecule, and materials discovery, enabling smarter, faster, and more collaborative drug discovery and therapeutics development at the enterprise level.

Last year, I met a broad and diverse set of customers who all shared a common need: to help their teams work smarter, faster, and more collaboratively. They want to digitize their labs and create R&D ecosystems that let them leverage their data, streamline workflows, and embrace transformative technologies such as AI. However, obstacles often stand in their way: data silos, the difficulty of correlating multimodal data, the complexity of specialty areas of science, concerns that software and system updates will interrupt research, and budget cuts. These challenges frequently prompt companies to seek Dotmatics’ help.

Digging deeper reveals an even larger problem: finding off-the-shelf, end-to-end scientific R&D solutions has long felt impossible. Companies must piece together R&D systems that loosely connect the instruments, tools, and applications across their labs. This often leaves research groups operating in isolation, struggling to collaborate, and unable to easily adopt new technologies or fully leverage their R&D data. Over the last few years, we’ve helped address these issues by uniting some of the market’s best-regarded scientific software onto our Dotmatics Platform and delivering solutions to streamline R&D workflows across biologics, small molecule, and chemical and materials discovery.

In 2024, we are setting our sights on challenges at the enterprise level. While we will continue to develop our existing products and solutions for customers of all sizes, there is a specific, urgent need in the R&D space for a truly multimodal scientific platform to drive the next generation of AI-assisted therapeutics. Enter Dotmatics Luma.

Dotmatics Luma—Powering the R&D Ecosystem

After years of responding to the pandemic, 2023 was a year of possibility for the future of drug discovery. The promise of AI was front and center in most industry discussions, and it’s at the core of an exciting new product we unveiled: Dotmatics Luma™. We are building the world’s most powerful multimodal scientific R&D platform, one that connects best-of-breed applications to enable collaboration, automation, and analysis and to power an AI-assisted future.

Luma is an adaptable, scientifically aware data management platform that flexibly and securely aggregates all relevant R&D data into intelligent, correlated data structures. Luma helps power the Dotmatics R&D ecosystem by enabling clean, reliable data analysis and paving the way for meta-analysis and AI/ML-based algorithms.

Launching Luma was a huge leap forward, but it was only the first step. Throughout 2024, we’ll continuously introduce new functionality to better help our customers manage their R&D data, support scientific discovery, and begin to meaningfully incorporate AI. Here’s what’s ahead.

2024 Product Development Focus

Manage R&D Data

Proprietary data is one of a company’s top business assets, and essential to drive R&D efforts forward. A company’s IP will continue to be crucial to its success in a world of AI. The power of an R&D data platform hinges on how easily and reliably users can get data in and out.

  • Instrument integration: Today, over 100 instruments are supported by Luma Lab Connect™, a module that automatically and securely ingests instrument files, parses out descriptive metadata and experiment results, and harmonizes data from instruments in the same class, including LCMS and flow cytometers. Plans are underway to expand these models and parsers for additional types of instruments, including bioreactors, cell analyzers, imagers, plate viewers, and more. Continued performance improvement is also a priority.

  • Open, complete data ingestion: In addition to supporting seamless data ingestion from Dotmatics scientific tools and ELNs, we will continue to make it easier for users to pull in data from any system through interfaces including JDBC, REST, and files. This might include data from animal studies, clinical studies, material registries, Excel files, or anything relevant. Capturing metadata and context for ingested data will continue to be a priority so that wherever and whenever data are accessed and used down the line, they will be complete and traceable.

  • End-user data management tools within a reliable data governance framework: We will continue to support and optimize Luma’s data-governance framework, giving customers precise control over their data. We want scientists and researchers to help define how data are ingested, modeled, stored, processed, and extracted. Administrators will be able to easily fine-tune access so they will know who is accessing what and for what purpose.
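To make the idea of harmonizing instruments in the same class more concrete, here is a minimal Python sketch. It is not Luma's implementation or API: the vendor names, field names, and the HarmonizedReading schema are all hypothetical, chosen only to show how two differently shaped exports could be mapped onto one common record while keeping the source file and instrument as traceable metadata.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class HarmonizedReading:
    """Hypothetical common schema for one LCMS peak, plus provenance."""
    sample_id: str
    retention_time_min: float
    peak_area: float
    source_file: str   # provenance: where the raw record came from
    instrument: str    # provenance: which parser/vendor produced it


# Hypothetical per-vendor mappings: common field name -> raw field name.
FIELD_MAPS: dict[str, dict[str, str]] = {
    "vendor_a": {"sample_id": "SampleName", "retention_time_min": "RT(min)", "peak_area": "Area"},
    "vendor_b": {"sample_id": "sample", "retention_time_min": "rt_sec", "peak_area": "area"},
}

# Unit conversions needed to reach the common schema (vendor_b reports seconds).
CONVERTERS: dict[tuple[str, str], Callable[[float], float]] = {
    ("vendor_b", "retention_time_min"): lambda seconds: seconds / 60.0,
}


def harmonize(raw: dict[str, Any], vendor: str, source_file: str) -> HarmonizedReading:
    """Map one raw instrument record onto the common schema."""
    values: dict[str, Any] = {}
    for target, source in FIELD_MAPS[vendor].items():
        value = raw[source]
        convert = CONVERTERS.get((vendor, target))
        values[target] = convert(value) if convert else value
    return HarmonizedReading(
        sample_id=str(values["sample_id"]),
        retention_time_min=float(values["retention_time_min"]),
        peak_area=float(values["peak_area"]),
        source_file=source_file,
        instrument=vendor,
    )


# Two exports of the same sample from different (made-up) instruments.
reading_a = harmonize({"SampleName": "S-001", "RT(min)": 2.5, "Area": 1200.0}, "vendor_a", "run_a.csv")
reading_b = harmonize({"sample": "S-001", "rt_sec": 150.0, "area": 1180.0}, "vendor_b", "run_b.json")
```

Once both records share field names and units, they can be compared or aggregated directly, and the provenance fields mean any downstream result can still be traced back to its original file.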

Drive Multimodal Science

A key goal for Luma is to drive the multimodal scientific processes involved in making, testing, and analyzing targets and therapeutics within and across our industry-leading scientific tools. Our priorities include:

  • Built-in scientific smarts: We will optimize the native integration between Luma and the Dotmatics scientific tools that scientists need to explore, analyze, and innovate, from our ELN and Data Discovery Platform to our suite of scientific tools, starting with Geneious Prime, Geneious Biologics, GraphPad Prism, and OMIQ. This work will focus on improving data and workflow exchange, as well as integration among the scientific tools themselves to provide a higher degree of interoperability. That will make it easier to do things like register a compound or extract data from one application and send it to another for analysis. These scientific integrations will make both workflows and dataflows more seamless.

  • Multimodal lab workflows: To help users quickly benefit from having their R&D data and tools united in the Dotmatics ecosystem, we plan to deliver a collection of out-of-the-box Dotmatics lab workflows, starting with a customizable task-based multimodal biotherapeutic workflow to support things like siRNA and antibody R&D. These workflows will stitch together different configurations and tools needed to deliver data-centric lab workflows. Over time we will develop different workflows for common, frequently repeated research tasks, automating steps wherever possible. We will also provide capabilities for customers to create their own workflows.

  • Low-code app building: We know that our Dotmatics workflow applications may need to be extended to meet specific user needs, and Luma will provide capabilities to do that. Customers will be able to safely extend out-of-the-box workflows or create new applications with common design patterns using a low-code app-building platform built on Databricks, which seamlessly handles all sorts of data, including unstructured, semi-structured, and fully structured. A key priority in the continued development of this app-building functionality will be preserving established degrees of governance and control within apps created using the platform. This includes governance and configuration management of code-based functionality, such as SQL and Python. Importantly, the Luma platform is low-code, not no-code.

Differentiated AI

With the growing role of AI in R&D, thoughtful inclusion of AI is a big priority for 2024. The potential for AI is even more exciting when working in the context of Luma, where data are AI-ready because they are all together, clean, well-annotated, and harmonized.

Generative AI (GenAI) has created a lot of hype across multiple industries, with meaningful capabilities released at a rapid pace. The first such capability we will pursue is GenAI query-building. For example, someone might ask Luma to find all successful protein expressions from the last two weeks whose CDR3 matches a particular amino acid string, along with their associated expression backbones, and to rank the results by liability type and count. This type of AI-based data querying will empower any user to quickly find data using natural language queries, instead of relying solely on data scientists to define data models, build queries, run SQL, and prepare results.
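For a sense of what such a natural language request replaces, the sketch below hand-writes a comparable query in SQL, run through Python's built-in sqlite3 module. It is purely illustrative and does not reflect Luma's data model or how its GenAI feature will generate queries: the expressions and liabilities tables, their columns, the sample rows, and the fixed reference date are all assumptions made for the example.

```python
import sqlite3

# Hypothetical schema and sample rows, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE expressions (
    id INTEGER PRIMARY KEY, cdr3 TEXT, backbone TEXT,
    expressed_on TEXT, succeeded INTEGER);
CREATE TABLE liabilities (expression_id INTEGER, liability_type TEXT);
""")
conn.executemany("INSERT INTO expressions VALUES (?, ?, ?, ?, ?)", [
    (1, "ARDYYGSSYFDY", "pBB-1", "2024-01-10", 1),
    (2, "ARDYYGSSYFDY", "pBB-2", "2024-01-12", 1),
    (3, "ARDYYGSSYFDY", "pBB-1", "2023-11-01", 1),  # outside the two-week window
    (4, "ARGGWLDV",     "pBB-1", "2024-01-11", 1),  # different CDR3
    (5, "ARDYYGSSYFDY", "pBB-2", "2024-01-13", 0),  # expression failed
    (6, "ARDYYGSSYFDY", "pBB-1", "2024-01-14", 1),
])
conn.executemany("INSERT INTO liabilities VALUES (?, ?)", [
    (1, "deamidation"), (1, "oxidation"), (2, "deamidation"),
    (3, "deamidation"), (5, "oxidation"), (6, "deamidation"),
])

# Successful expressions with the given CDR3 in the 14 days up to :today,
# grouped by backbone and liability type, ranked by count.
QUERY = """
SELECT e.backbone, l.liability_type, COUNT(*) AS n
FROM expressions AS e
JOIN liabilities AS l ON l.expression_id = e.id
WHERE e.succeeded = 1
  AND e.cdr3 = :cdr3
  AND e.expressed_on >= date(:today, '-14 days')
  AND e.expressed_on <= :today
GROUP BY e.backbone, l.liability_type
ORDER BY n DESC, l.liability_type, e.backbone
"""


def rank_liabilities(connection, cdr3, today):
    """Run the query for one CDR3 sequence and a reference date (ISO format)."""
    return connection.execute(QUERY, {"cdr3": cdr3, "today": today}).fetchall()


rows = rank_liabilities(conn, "ARDYYGSSYFDY", "2024-01-15")
for backbone, liability_type, count in rows:
    print(backbone, liability_type, count)
```

Even in this toy form, the query requires knowing the table layout, the join key, and the date arithmetic, which is the kind of expertise a natural language interface is meant to make optional.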

But this just scratches the surface of what GenAI and large language models (LLMs) will be able to do for the scientific community. Models are only as good as the data sets used to train them. Our core focus for this year is ensuring our customers have the tooling needed to get their proprietary data into a format that is structured and correlated enough to train meaningful models, ones that are optimized and differentiated on their particular innovations and data sets. We want to power a future where our customers are able to combine publicly available and proprietary R&D data to train built-for-purpose models. This could include data that have traditionally been disparate and hard to correlate, such as data related to protein design, wet lab processes, raw assay data, calculated endpoint results, and clinical trials.

Over time, these models will help scientists build more effective AI that could better help spot scientific trends, or suggest targets or therapeutics researchers might want to consider.

How Can We Help?

Whether you're a current customer or someone looking for a solution to your specific challenges, we welcome your thoughts on how we can focus our efforts in 2024 to best support your unique R&D needs.