Introducing the first scientific intelligence platform powered by Databricks
Dotmatics has announced a strategic partnership with Databricks to launch Dotmatics Luma, the first scientific intelligence platform built on Databricks' AI cloud, designed to empower life sciences and biopharmaceutical companies by integrating scientific and data intelligence for enhanced data control, security, and AI-driven drug discovery applications.
Earlier this year I attended the Data + AI Summit, hosted by the data and AI company Databricks. During his keynote, Databricks CEO Ali Ghodsi revealed that 85 percent of AI use cases still aren’t in production. That's not terribly surprising. We know lots of companies want to leverage the power of AI, but they lack the tools and expertise to bring those use cases to market as finished products within their respective industries.
Companies need control over their data at its source, and over the AI models running on that data, to ensure security, privacy compliance, and accurate decision-making. Effectively managing data can offer an organization a competitive edge, reduce costs, and provide greater flexibility when integrating applications.
Advancing Scientific Intelligence in Life Sciences
That's why we have just announced a strategic partnership with Databricks. Dotmatics is one of the very first "built-on Databricks" partners, and Dotmatics Luma is built on the Databricks platform, leveraging a modern, market-leading AI cloud that is optimally designed for scientific data.
Michael Sanky, Databricks VP of Healthcare & Life Sciences, said, “Dotmatics Luma exemplifies the transformative potential of building on Databricks, and it’s incredible to watch its adoption among the biopharmaceutical community who are excited to harness the full power of their data.”
Databricks recognizes the value that other companies can bring by building tools on top of its ecosystem, and so it’s committed to bringing the best technology solutions to market within every industry; this includes working with partners like Dotmatics who are building the next generation of data-driven applications for life sciences.
What does that mean for our customers? It means using solutions built upon data intelligence.
Blending Scientific Intelligence + Data Intelligence
But Dotmatics goes one step further: we know that in drug discovery, scientific intelligence is just as essential as data intelligence. For organizations to harness the possibilities of generative AI, their tools must be intimately familiar with scientific workflows and domain expertise. To bring AI use cases into production more quickly, you need to be able to feed a giant funnel of scientific data into the cloud, make that data AI-ready, and leverage new AI capabilities offered by advanced data intelligence tools.
Today Dotmatics does that through:
- Luma Platform helps organize the data and provides governed, configured access and data flow between different customer ecosystems.
- Lab Connect serves as a massive data funnel for scientific data into the Luma and Databricks ecosystem, allowing for structured access to hundreds of different types of data from scientific instruments and research tools, including many of our own best-in-class tools.
The result is well-organized, secure data ready for Delta Sharing into the customer’s own ecosystem, and ready for AI, BI, or notebook use cases. Each of these steps is a massively challenging task. But today Dotmatics' relationship with Databricks is creating some really exciting possibilities for the future of drug discovery, a few of which we’ll preview here:
Use MLflow to Store Training Runs, Apply Security Permissions
MLflow helps data scientists and engineers manage the process of developing machine learning (ML) models. Think of it as a notebook, toolbox, and showroom combined—all designed to streamline the messy, iterative process of ML. When you're testing different approaches to train your ML model (e.g. changing parameters, algorithms, or data), it keeps track of what you did and the results. This helps you compare experiments and pick the best one. Once you've built a model you like, MLflow helps save it in a standard way so it can be reused or shared with others. The technology makes it easier to put your trained model into action, whether that's in a web app, a batch processing job, or another system. It provides a centralized place for teams to document and share their work, making it easier for others to understand and build upon your progress.
We think our customers will particularly like the ability to apply security and permissions on who may access those models. Plus, it’s super easy to use, requiring one line of code. Basically, any model that’s stored in Unity Catalog can be served right away. One use case that we've been using internally is Chemical Structure Activity prediction. With this method of storing and serving models, it's very easy to run the predictions for every new structure added to the system.
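To make the "keep track of what you did and pick the best one" idea concrete, here is a minimal sketch in plain Python. It does not use MLflow's own API: the Run and Tracker classes are hypothetical stand-ins, and the parameter names and scores are illustrative values, not real results.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One training attempt: the settings used and the scores obtained."""
    name: str
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)


class Tracker:
    """Hypothetical in-memory stand-in for an experiment tracking server."""

    def __init__(self):
        self.runs = []

    def log_run(self, name, params, metrics):
        # Copy the dicts so later edits by the caller do not alter the record.
        run = Run(name, dict(params), dict(metrics))
        self.runs.append(run)
        return run

    def best_run(self, metric, higher_is_better=True):
        # Only compare runs that actually recorded the requested metric.
        candidates = [r for r in self.runs if metric in r.metrics]
        if not candidates:
            raise ValueError(f"no run recorded metric {metric!r}")
        pick = max if higher_is_better else min
        return pick(candidates, key=lambda r: r.metrics[metric])


tracker = Tracker()
# Illustrative numbers only, not real results.
tracker.log_run("rf-100-trees", {"n_estimators": 100}, {"auc": 0.81})
tracker.log_run("rf-500-trees", {"n_estimators": 500}, {"auc": 0.86})
tracker.log_run("gbm-default", {"learning_rate": 0.1}, {"auc": 0.84})

best = tracker.best_run("auc")
print(best.name, best.params)
```

A real tracking service adds what this sketch leaves out: persistent storage, saved model artifacts, and the access permissions described above.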
Write SQL Queries to Expedite the R&D Process
Most people know that AI companies offer APIs for accessing their large language models (LLMs), but Databricks takes it a step further by integrating those models directly into SQL queries. And that can open up new, unexplored possibilities in the R&D pathway to drug discovery.
Imagine that you have 10,000 description fields and you want to know which ones to look into more closely. It would take a few hours to write a script to pull from that database of fields, submit each one to an API endpoint, and then get results back. Instead, our customers can use Luma’s dataflows and their SQL-based transformations. You can write a SQL query that asks your AI model to predict something as part of the query's answer. Instead of just pulling data, the query works like: "Hey AI, based on this data, what do you predict?" The result combines your data with the AI's prediction, right in the query.
It’s the difference between typing 10,000 different questions individually into ChatGPT versus asking all 10,000 questions at once and getting the responses very quickly. Dotmatics Luma makes this easy, and this functionality is 100% usable in our platform today.
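The pattern of calling a model from inside a query can be sketched with Python's built-in sqlite3 module, which lets you register a Python function and call it from SQL. This illustrates the idea only and is not Luma or Databricks code: the compounds table, the sample rows, and the flag_for_review function (a keyword check standing in for a real model) are all assumptions made for the example.

```python
import sqlite3


def flag_for_review(description):
    """Placeholder 'model': flags text that mentions toxicity.

    A real setup would call an LLM or a trained model here instead.
    """
    if description is None:
        return 0
    return 1 if "toxic" in description.lower() else 0


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the same name, taking one argument.
conn.create_function("flag_for_review", 1, flag_for_review)

conn.execute("CREATE TABLE compounds (id INTEGER PRIMARY KEY, description TEXT)")
conn.executemany(
    "INSERT INTO compounds (id, description) VALUES (?, ?)",
    [
        (1, "Stable at room temperature"),
        (2, "Hepatotoxic at high dose"),
        (3, "No adverse findings"),
    ],
)

# One query returns each row together with the prediction for that row.
rows = conn.execute(
    "SELECT id, description, flag_for_review(description) AS needs_review "
    "FROM compounds ORDER BY id"
).fetchall()

flagged_ids = [row[0] for row in rows if row[2] == 1]
print(flagged_ids)
```

The query stays the same whether the table holds three rows or 10,000. There is no loop to write and no per-record API call to manage.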
Enhancing AI with Scientific Context
Similarly, when you ask an LLM like ChatGPT a question, you can define functions that help it answer more accurately and with deeper context. Dotmatics is designing the equivalent for life sciences R&D. We’re developing scientific functions that will allow the AI to execute a number of statistical or scientific operations in the process of answering a question. That includes the ability to give the AI governed, controlled access to your data, should you choose to do so. We’re also building the ability to give the AI scientific powers that provide even greater understanding, such as statistical calculations or gating in flow cytometry data.
That could be a real gamechanger. Giving the AI the tools it needs to answer deep, in-field questions about our customers' data is what they need to supercharge their decision-making workflows.
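As a rough illustration of how such functions work, here is a minimal sketch of the dispatch step: the model asks for a named function with arguments, and the application runs it only if it is on an approved list. The tool names, the JSON request shape, and the run_tool_call helper are hypothetical and do not come from any Dotmatics or Databricks API.

```python
import json
import statistics

# Hypothetical registry: tool name -> Python callable the AI is allowed to invoke.
TOOLS = {
    "mean": lambda values: statistics.mean(values),
    "stdev": lambda values: statistics.stdev(values),
}


def run_tool_call(tool_call_json):
    """Execute one tool request of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        # Refuse anything outside the governed set of functions.
        return {"error": f"unknown tool: {name}"}
    result = TOOLS[name](**call["arguments"])
    return {"name": name, "result": result}


# In practice the model would emit this request; here it is hard-coded.
request = '{"name": "mean", "arguments": {"values": [2.0, 4.0, 6.0]}}'
response = run_tool_call(request)
print(response)
```

Because the registry is an explicit allow-list, governance amounts to deciding which functions go into it and which data those functions can reach.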
Build Confidence with Operational Monitoring for AI
Operational monitoring is essential for ensuring AI systems run reliably and effectively in real-world environments. It’s what catches issues early, like a data pipeline breaking or a model starting to drift from accuracy. In regulated industries, it’s the key to staying compliant and audit-ready. It also helps build trust—people rely on AI that’s consistent and reliable. Simply put, it’s how AI stays useful, ethical, and effective.
Databricks recently introduced operational monitoring through an interface that lets you see what the AI was “thinking” at each stage of answering a question. Dotmatics Luma will be able to use this new Databricks functionality to uncover how the AI was working. And though this is not a customer-facing feature at this time, that level of detail is important for knowing what the AI was trying to do at different points throughout the process. It provides deep observability into whether the AI is getting answers right or wrong, which is what you need to actually make a good product.
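Step-level visibility of this kind can be sketched with a small decorator that records what each stage received, what it returned, and how long it took. This is not the Databricks interface: the traced decorator, the TRACE list, and the two placeholder steps (retrieve and answer) are hypothetical names chosen for the example.

```python
import time
from functools import wraps

TRACE = []  # hypothetical in-memory trace; a real system would persist this


def traced(step_name):
    """Decorator that records the inputs, output, and duration of each step."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            TRACE.append({
                "step": step_name,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": result,
                "seconds": time.perf_counter() - start,
            })
            return result
        return wrapper
    return decorator


@traced("retrieve")
def retrieve(question):
    # Placeholder retrieval step returning a made-up record name.
    return ["assay record 17"]


@traced("answer")
def answer(question, context):
    # Placeholder generation step that only formats the retrieved context.
    return f"Based on {len(context)} record(s): see {context[0]}"


question = "Which assay should we review?"
final = answer(question, retrieve(question))
steps = [entry["step"] for entry in TRACE]
print(steps)
```

Reading the trace afterward shows which step produced what, which is the information you need when an answer turns out to be wrong.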
This is all just the beginning. In the coming year, Dotmatics and Databricks will continue to unlock new opportunities to enable scientific research organizations to stay ahead of the curve in data science, bench science, and advanced AI-driven insights. Together, we’re bridging data and scientific intelligence to drive innovation and accelerate breakthroughs in drug discovery and beyond.