A new technique from Zhejiang University and Alibaba Group gives large language model (LLM) agents a dynamic memory, making them more efficient and effective at complex tasks. The technique, called Memp, provides agents with a “procedural memory” that is continuously updated as they gain experience, much like how humans learn from practice.
Memp creates a lifelong learning framework in which agents don’t have to start from scratch for every new task. Instead, they become progressively better and more efficient as they encounter new situations in real-world environments, a key requirement for reliable enterprise automation.
The case for procedural memory in AI agents
LLM agents hold promise for automating complex, multi-step business processes. In practice, however, these long-horizon tasks can be fragile. The researchers point out that unexpected events such as network glitches, user interface changes or shifting data schemas can derail the entire process. For current agents, this often means starting over every time, which can be time-consuming and expensive.
Meanwhile, many complex tasks, despite surface differences, share deep structural commonalities. Instead of relearning these patterns every time, an agent should be able to distill and reuse its experience from past successes and failures, the researchers explain. This requires a specific “procedural memory,” which in humans is the long-term memory responsible for skills like typing or riding a bike that become automatic with practice.

Current agent systems often lack this capability. Their procedural knowledge is usually hand-crafted by developers, stored in rigid prompt templates or embedded within the model’s parameters, which are expensive and slow to update. Even existing memory-augmented frameworks provide only coarse abstractions and do not adequately address how skills should be built, indexed, corrected and eventually pruned over an agent’s lifecycle.
As a result, the researchers note in their paper, there is no principled way to measure how efficiently an agent develops its procedural repertoire or to guarantee that new experiences improve performance rather than degrade it.
How Memp works
Memp is a task-agnostic framework that treats procedural memory as a core component to be optimized. It consists of three key stages that work in a continuous loop: building, retrieving and updating memory.
Memories are built from an agent’s past experiences, or “trajectories.” The researchers explored storing these memories in two formats: verbatim, step-by-step actions; or distilling those actions into higher-level, script-like abstractions. For retrieval, the agent searches its memory for the most relevant past experience when given a new task. The team experimented with different approaches, such as vector search to match a new task’s description against past queries, or extracting keywords to find the best fit.
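The paper’s implementation is not shown here, but the build-and-retrieve loop can be sketched in a few lines of Python. Everything below (the MemoryEntry and ProceduralMemory names, the embedding function and the cosine-similarity search) is an illustrative assumption, not Memp’s actual API.

```python
from __future__ import annotations

from dataclasses import dataclass

import numpy as np


@dataclass
class MemoryEntry:
    task_description: str          # the task this trajectory solved
    content: str                   # verbatim steps, or a distilled script-like abstraction
    embedding: np.ndarray | None = None


class ProceduralMemory:
    def __init__(self, embed_fn):
        self.embed_fn = embed_fn   # any text-embedding function: str -> 1-D np.ndarray
        self.entries: list[MemoryEntry] = []

    def build(self, task_description: str, trajectory: str) -> None:
        """Store a past trajectory (or its abstraction) as a reusable memory."""
        self.entries.append(MemoryEntry(
            task_description, trajectory, self.embed_fn(task_description)))

    def retrieve(self, new_task: str, k: int = 1) -> list[MemoryEntry]:
        """Vector search: return the k memories whose tasks look most similar."""
        query = self.embed_fn(new_task)

        def cosine(e: MemoryEntry) -> float:
            return float(np.dot(query, e.embedding) /
                         (np.linalg.norm(query) * np.linalg.norm(e.embedding)))

        return sorted(self.entries, key=cosine, reverse=True)[:k]
```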
The most important component is the update mechanism. Memp introduces several strategies for letting an agent’s memory evolve. As an agent completes more tasks, its memory can be updated by simply adding every new experience, by filtering for successful outcomes only or, most effectively, by reflecting on failures to correct and revise the original memory.
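The update stage can be sketched in the same illustrative style, reusing the ProceduralMemory class above. The three policies mirror the article’s description, but the exact strategies in Memp may differ; reflect_fn is a hypothetical stand-in for an LLM call that revises a stored procedure.

```python
def update_memory(memory: ProceduralMemory, task: str, trajectory: str,
                  succeeded: bool, strategy: str = "reflect", reflect_fn=None) -> None:
    """Three illustrative update policies for the loop described above."""
    if strategy == "add_all":
        # Keep every experience, successful or not.
        memory.build(task, trajectory)
    elif strategy == "success_only":
        # Keep only trajectories that actually completed the task.
        if succeeded:
            memory.build(task, trajectory)
    elif strategy == "reflect":
        if succeeded:
            memory.build(task, trajectory)
        else:
            # On failure, revise the closest existing memory instead of
            # discarding the experience; reflect_fn stands in for an LLM call
            # that rewrites the stored procedure given the failed trajectory.
            nearest = memory.retrieve(task, k=1)
            if nearest and reflect_fn is not None:
                nearest[0].content = reflect_fn(nearest[0].content, trajectory)
```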

This focus on a dynamic, evolving memory places Memp within a growing field of research aimed at making AI agents more reliable for long-horizon tasks. The work parallels other efforts, such as Mem0, which consolidates key information from long conversations into facts and knowledge graphs to ensure consistency. Similarly, A-MEM enables agents to create and link “memory notes” from their interactions, building a complex knowledge structure over time.
However, co-author Runan Fang highlighted a key distinction between Memp and other frameworks.
“Mem0 and A-MEM are excellent works … but they focus on remembering salient content within a single trajectory or conversation,” Fang told VentureBeat. In short, they help an agent remember “what happened.” Memp, by contrast, targets cross-trajectory procedural memory: it focuses on the “how-to” knowledge that can be generalized across similar tasks, preventing the agent from rediscovering it from scratch.
“By distilling past successful workflows into procedural priors, Memp boosts success rates and shortens steps,” Fang said. “Crucially, we also introduce an update mechanism so that the procedural memory keeps improving; after all, practice makes perfect for agents as well.”
Overcoming the ‘cold-start’ problem
While the concept of learning from past trajectories is powerful, it raises a practical question: how does an agent build its initial memory when there are no perfect examples to learn from? The researchers address this “cold-start” problem with a pragmatic approach.
Fang explained that developers can first define a robust evaluation metric rather than requiring a perfect “gold” trajectory upfront. This metric, which can be rule-based or even another LLM, scores the quality of an agent’s performance. “Once the metric is in place, we let state-of-the-art models explore within the agent workflow and retain the trajectories that achieve the highest scores,” Fang said. This process quickly bootstraps an initial set of useful memories, allowing a new agent to get up to speed without extensive manual programming.
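In code, that bootstrapping step might look like the hypothetical routine below: explore each task, score the resulting trajectory with the chosen metric, and keep only the high-scoring attempts as seed memories (run_agent, score_fn and the threshold are assumptions for illustration).

```python
def bootstrap_memory(memory: ProceduralMemory, tasks: list[str],
                     run_agent, score_fn, threshold: float = 0.8) -> None:
    """Seed an empty memory store without any gold trajectories.

    run_agent(task) -> trajectory text produced by a strong model exploring the workflow.
    score_fn(task, trajectory) -> quality score in [0, 1], rule-based or an LLM grader.
    """
    for task in tasks:
        trajectory = run_agent(task)
        if score_fn(task, trajectory) >= threshold:
            memory.build(task, trajectory)   # keep only the highest-quality attempts
```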
Memp in action
To test the framework, the team implemented Memp on top of powerful LLMs including GPT-4o, Claude 3.5 Sonnet and Qwen2.5, evaluating them on complex tasks such as household chores in the ALFWorld benchmark and information gathering in TravelPlanner. The results showed that building and retrieving procedural memory allowed an agent to distill and reuse its prior experience effectively.
During testing, Memp-equipped agents not only achieved higher success rates but also became far more efficient. They cut out fruitless exploration and trial-and-error, leading to a substantial reduction in both the number of steps and the token consumption required to complete a task.

One of the most significant findings for enterprise applications is that procedural memory is transferable. In one experiment, procedural memory generated by the powerful GPT-4o was given to a much smaller model, Qwen2.5-14B. The smaller model saw a significant boost in performance, improving its success rate and reducing the steps needed to complete tasks.
According to Fang, this works because smaller models often handle simple, single-step actions well but falter at long-horizon planning and reasoning. Procedural memory from larger models effectively fills this capability gap. This suggests that knowledge can be acquired with a state-of-the-art model and then deployed on smaller, more cost-effective models without losing the benefits of that experience.
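Because these memories are plain text rather than model weights, transferring them can be as simple as serializing the store built with the strong model and loading it into an agent running the smaller one. The helpers below are an assumed illustration, not Memp’s interface.

```python
import json


def export_memory(memory: ProceduralMemory, path: str) -> None:
    """Dump the store built with a strong model (e.g. GPT-4o) to plain JSON."""
    with open(path, "w") as f:
        json.dump([{"task": e.task_description, "content": e.content}
                   for e in memory.entries], f)


def import_memory(memory: ProceduralMemory, path: str) -> None:
    """Load those memories into an agent backed by a smaller model."""
    with open(path) as f:
        for item in json.load(f):
            memory.build(item["task"], item["content"])  # re-embedded with the new agent's embedder
```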
Toward truly autonomous agents
By equipping agents with a memory-update mechanism, the Memp framework allows them to continuously build and refine their procedural knowledge while operating in a live environment. The researchers found that this gives agents a steady, near-linear mastery of their tasks.
However, the path to full autonomy requires overcoming another hurdle: many real-world tasks, such as producing a research report, lack a simple success signal. To improve continuously, an agent needs to know whether it did a good job. Fang says the future lies in using LLMs as judges.
“Today we often combine powerful models with hand-crafted rules to compute a completeness score,” he notes. “That works, but hand-written rules are brittle and hard to generalize.”
An LLM-as-judge could provide the nuanced, supervisory feedback an agent needs to self-correct on subjective tasks. That would make the entire learning loop more scalable and robust, marking a critical step toward the flexible, adaptable and truly autonomous AI workers needed for sophisticated enterprise automation.
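As a rough sketch of that idea, an LLM judge can be wrapped as a scoring function and plugged into the bootstrap and update loops shown earlier; the prompt and the generic `complete` callable are assumptions, not part of Memp.

```python
def llm_judge_score(complete, task: str, output: str) -> float:
    """Score an agent's output with an LLM judge.

    `complete` is any prompt-in, text-out function for calling an LLM; the
    returned score can feed the bootstrap and update routines sketched earlier.
    """
    prompt = (
        "You are grading an AI agent's work.\n"
        f"Task: {task}\n"
        f"Output:\n{output}\n"
        "Reply with a single number between 0 and 1 for overall quality."
    )
    try:
        return max(0.0, min(1.0, float(complete(prompt).strip())))
    except ValueError:
        return 0.0   # an unparsable judgement counts as a failure
```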