Many enterprise AI agent development efforts never make it to production, and it's not because the technology isn't ready. The problem, according to Databricks, is that companies are still relying on manual evaluation, a process that is slow, inconsistent and difficult to scale.
Today at the Data + AI Summit, Databricks launched Agent Bricks as a solution to that challenge. The technology builds on the company's Mosaic AI Agent Framework, announced in 2024.
Agent Bricks automates agent development using a range of research-backed innovations from the Databricks platform. Key among them is the integration of TAO (Test-time Adaptive Optimization), which provides a novel approach to AI tuning without the need for labeled data. Agent Bricks also generates domain-specific synthetic data, creates task-aware benchmarks and optimizes the balance between quality and cost without manual intervention.
Fundamentally, the goal of the new platform is to solve a problem Databricks users faced in their existing AI agent development efforts.
"They were flying blind; they had no way to evaluate these agents," Hanlin Tang, Databricks' chief technology officer of neural networks, told VentureBeat. "Most of them were relying on a kind of manual vibe checking to see if the agent looks good enough, but it doesn't give them the confidence to go into production."
From research innovation to enterprise AI at production scale
Tang was co-founder and CTO of MosaicML, which Databricks acquired for $1.3 billion in 2023.
At MosaicML, much of the research innovation did not necessarily have an immediate enterprise impact. That changed after the acquisition.
"There was a big light bulb moment for me when we first launched our products on Databricks, and immediately, overnight, we had thousands of enterprise customers," Tang said.
In contrast, before the acquisition, MosaicML would spend months trying to win a handful of enterprise customers. Integration into Databricks gave the Mosaic research team direct access to enterprise problems, and that exposure revealed new research opportunities.
"It's only when you're in contact with enterprise customers, working deeply with them, that you really surface interesting research problems to go after," Tang explained. "Agent Bricks … is, in some ways, an evolution of everything we were working on at Mosaic, now that we're fully part of Databricks."
Solving the agentic AI evaluation crisis
Enterprise teams face an expensive trial-and-error optimization process. Without task-aware benchmarks or domain-specific test data, each agent adjustment becomes an expensive guessing game. Quality drift, cost overruns and missed deadlines follow.
Agent Bricks automates the entire optimization pipeline. The platform takes a high-level task description and enterprise data, and automatically handles the rest. First, it generates task-specific evaluations and LLM judges. Next, it creates synthetic data that mirrors the customer's data. Finally, it explores optimization techniques to find the best configuration.
"The customer describes the problem at a high level, and they don't have to go into the low-level details, as we take care of them," Tang said. "The system produces synthetic data and builds specific LLM judges for each task."
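The evaluation loop described above — synthetic examples scored by an automatically built judge — can be sketched in miniature. This is a hypothetical illustration of the general LLM-as-judge pattern, not the Agent Bricks API; the toy judge, agent and data are all invented for the sketch.

```python
# Hypothetical sketch of an LLM-as-judge evaluation loop -- not the
# Agent Bricks API, just an illustration of the pattern it automates.

def judge(task_description: str, output: str) -> bool:
    """Stand-in for an LLM judge that scores an agent output against a task.
    A real judge would call a model with a grading prompt; this toy version
    uses a trivial heuristic so the sketch stays runnable."""
    return bool(output.strip()) and len(output) < 2000

def evaluate_agent(agent, synthetic_inputs, task_description):
    """Run the agent over synthetic examples and report a pass rate."""
    passed = sum(judge(task_description, agent(x)) for x in synthetic_inputs)
    return passed / len(synthetic_inputs)

# Toy agent and synthetic documents standing in for the generated benchmark.
toy_agent = lambda doc: f"summary of: {doc[:50]}"
synthetic_docs = ["invoice #1 ...", "maintenance log ...", "support ticket ..."]
print(evaluate_agent(toy_agent, synthetic_docs, "summarize the document"))  # 1.0
```

The point of the pattern is that the pass rate gives teams a repeatable number to optimize against, replacing the "vibe checking" Tang describes.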
The platform offers four agent configurations:
- Information extraction: Converts documents (PDFs, emails) into structured data. One use case is retail outfits pulling product details from supplier PDFs, even with complex formatting.
- Knowledge assistant: Provides accurate, cited answers from enterprise data. For example, manufacturing technicians can get immediate answers from maintenance manuals without digging through binders.
- Custom LLM: Handles text transformation tasks (summarization, classification). For example, healthcare organizations can customize models that summarize patient notes for clinical workflows.
- Multi-agent supervisor: Orchestrates multiple agents for complex workflows. One use case is a financial services firm coordinating agents for fraud detection, document retrieval and compliance checks.
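The multi-agent supervisor pattern from the last bullet can be illustrated with a toy coordinator. This is an invented sketch of the general pattern, not Databricks code; the specialist agents and the fixed sequencing are assumptions for the example.

```python
# Hypothetical sketch of a multi-agent supervisor: one coordinator
# dispatches a case to specialist agents in sequence. Agent names and
# the workflow are invented for illustration; not the Agent Bricks API.

def fraud_agent(case: str) -> str:
    return f"fraud check on: {case}"

def retrieval_agent(case: str) -> str:
    return f"documents retrieved for: {case}"

def compliance_agent(case: str) -> str:
    return f"compliance review of: {case}"

def supervisor(case: str) -> list[str]:
    """Coordinate specialist agents in sequence for one workflow."""
    steps = [fraud_agent, retrieval_agent, compliance_agent]
    return [step(case) for step in steps]

results = supervisor("wire transfer #4711")
print(results[0])  # fraud check on: wire transfer #4711
```

A production supervisor would route dynamically (deciding which agents to call based on the case) rather than running a fixed pipeline, but the division of labor is the same.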
Agents are great, but don’t forget about data
Building and evaluating agents is a core part of enterprise AI readiness, but it is not the only part required.
Databricks positions Agent Bricks as the AI consumption layer, sitting on top of its unified data stack. At the Data + AI Summit, Databricks also announced the general availability of its Lakeflow data engineering platform, which was first previewed in 2024.
Lakeflow solves the data preparation challenge. It unifies three critical data engineering tasks that previously required separate tools. Ingestion handles getting both structured and unstructured data into Databricks. Transformation provides efficient data cleaning, reshaping and preparation. Orchestration manages production workflows and scheduling.
The workflow connection is direct: Lakeflow prepares the enterprise data through integrated ingestion and transformation, then Agent Bricks builds AI agents tuned to that finished data.
"We help get the data into the platform, and then you can do ML, BI and AI analytics on it," said Bilal Aslam, senior director of product management at Databricks.
Beyond data ingestion, Agent Bricks also benefits from Unity Catalog's governance capabilities, including access controls and data lineage tracking. This integration ensures that agent behavior respects enterprise data governance without additional configuration.
Agent learning from human feedback aims to end prompt stuffing
Today, one common approach to steering AI agents is the system prompt. Tang referred to the practice as 'prompt stuffing,' where users jam all kinds of guidance into a prompt in the hope that the agent will follow it.
Agent Bricks introduces a new concept called agent learning from human feedback. The feature automatically adjusts system components based on natural-language guidance, solving what Tang calls the prompt stuffing problem. According to Tang, the prompt stuffing approach often fails because an agent system has multiple components that require adjustment.
Agent learning from human feedback is a system that automatically interprets natural-language guidance and adjusts the appropriate system components. The approach echoes reinforcement learning from human feedback (RLHF), but it operates at the system level rather than on individual model weights.
The system handles two main challenges. First, natural-language guidance can be ambiguous: what does 'respect our brand voice' really mean? Second, an agent system consists of multiple configuration points, and teams struggle to identify which components need adjustment.
The system eliminates the guesswork of which agent components require adjustment for a specific behavioral change.
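The routing problem described here — deciding which system component a piece of natural-language feedback should modify — can be made concrete with a toy dispatcher. Everything in this sketch is invented for illustration: the component names, the keyword rules and the default. The real system interprets feedback with models, not keyword matching.

```python
# Toy illustration of routing human feedback to a system component.
# Component names and keyword rules are invented for this sketch; a real
# system would use an LLM to interpret the feedback, not substring checks.

COMPONENT_KEYWORDS = {
    "system_prompt": ["tone", "voice", "style"],
    "retriever": ["source", "document", "cite"],
    "output_format": ["format", "json", "length"],
}

def route_feedback(feedback: str) -> str:
    """Pick which agent component a piece of feedback should modify."""
    text = feedback.lower()
    for component, keywords in COMPONENT_KEYWORDS.items():
        if any(k in text for k in keywords):
            return component
    return "system_prompt"  # default: fold unmatched guidance into the prompt

print(route_feedback("Respect our brand voice"))          # system_prompt
print(route_feedback("Always cite the source document"))  # retriever
```

The contrast with prompt stuffing is that each instruction lands on the component it actually concerns, instead of everything accumulating in one ever-growing system prompt.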
"This, we believe, will help agents become more steerable," Tang said.
Technical advantages over existing frameworks
There is no shortage of agentic AI development frameworks and tools on the market today. The growing list of vendor options includes LangChain, Microsoft and Google tools.
Tang argued that what makes Agent Bricks different is optimization. Instead of requiring manual configuration and tuning, many research techniques are automatically incorporated into Agent Bricks: TAO, in-context learning, prompt optimization and fine-tuning.
When it comes to agent-to-agent communication, there are several options on the market today, including Google's Agent2Agent protocol. According to Tang, Databricks is currently exploring various agent protocols and has not committed to a single standard.
Currently, Agent Bricks handles agent-to-agent communication through two primary methods:
- Exposing agents as endpoints wrapped in various protocols.
- Using a multi-agent supervisor that supports MCP (Model Context Protocol).
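The first method above — wrapping an agent as a protocol endpoint — amounts to putting a request/response envelope around an agent function. The sketch below only mimics the shape of a JSON-RPC-style exchange (the style of protocol MCP builds on); it is not an MCP implementation, and the agent, request fields and handler are invented for illustration.

```python
# Hypothetical sketch of wrapping an agent as a protocol endpoint.
# This toy handler mimics a JSON request/response exchange; it is NOT
# an MCP implementation, and the field names are invented.
import json

def my_agent(question: str) -> str:
    """Stand-in for a deployed agent."""
    return f"answer to: {question}"

def handle_request(raw: str) -> str:
    """Decode a JSON request, call the agent, encode the reply."""
    request = json.loads(raw)
    result = my_agent(request["params"]["question"])
    return json.dumps({"id": request["id"], "result": result})

reply = handle_request('{"id": 1, "params": {"question": "uptime?"}}')
print(reply)  # {"id": 1, "result": "answer to: uptime?"}
```

Because the agent sits behind a generic envelope like this, the same agent can be re-wrapped in different protocols (MCP, Agent2Agent or plain HTTP) without changing its logic, which is why Databricks can defer committing to a single standard.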
Strategic implications for enterprise decision makers
For enterprises leading the way in AI, having the right techniques to evaluate agent quality and effectiveness is critical.
Deploying agents without evaluation will not lead to an optimal result, and neither will agents built without a solid data foundation. When considering agent development technologies, it is important to have a proper mechanism for evaluating the available options.
The agent learning from human feedback approach is also noteworthy for enterprise decision makers, as it helps steer agentic AI toward better outcomes.
For enterprises looking to lead in AI agent adoption, this development means evaluation infrastructure is no longer a blocking factor. Organizations can focus resources on use-case identification and data preparation rather than on building evaluation frameworks.