    EAGLET boosts AI agent performance on long-range tasks by creating custom plans

    By PineapplesUpdate, October 15, 2025

    2025 was supposed to be the year of the "AI agent," according to Nvidia CEO Jensen Huang and other AI industry figures. In many ways it has been, with major AI model providers such as OpenAI, Google and even Chinese rivals such as Alibaba releasing fine-tuned AI models or applications designed to focus on a narrow set of tasks such as web search and report writing.

    But one major hurdle to a future of high-performing, reliable AI agents remains: keeping them on task as the task stretches across many steps. Third-party benchmark tests show that even the most powerful AI models fail more often the more steps they take to complete a task and the longer (hours or more) they spend on it.

    A new academic framework called EAGLET proposes a practical and efficient method to improve long-horizon task performance in LLM-based agents – without the need for manual data labeling or retraining.

    Developed by researchers at Tsinghua University, Peking University, DeepLang AI and the University of Illinois Urbana-Champaign, EAGLET offers a "global planner" that can be integrated into existing agent workflows to reduce hallucinations and improve task efficiency.

    EAGLET is a fine-tuned language model that interprets task instructions – usually provided as prompts by the user or the agent’s operating environment – and produces a high-level plan for the agent (which is driven by its own LLM). It does not intervene during execution, but its up-front guidance helps reduce planning errors and improve task completion rates.

    Solving the planning problem in long-horizon agents

    Many LLM-based agents struggle with long-horizon tasks because they rely on reactive, step-by-step reasoning. This approach often leads to trial-and-error behavior, planning hallucinations, and inefficient trajectories.

    EAGLET addresses this limitation by introducing a global planning module that works alongside the executor agent.

    Instead of blending planning and action generation in a single model, EAGLET separates them, enabling more coherent, task-level strategies.
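
    To make the separation concrete, here is a minimal sketch of a plan-then-execute loop. It is an illustration under stated assumptions, not the authors' implementation: the names `Planner`, `Policy`, `run_episode`, `toy_planner` and `toy_policy` are all hypothetical, and the toy functions stand in for real model calls. The one property it mirrors from the description above is that the planner is called once, before execution, and never again.

```python
from typing import Callable, List

# Hypothetical type aliases; neither name comes from the EAGLET paper.
Planner = Callable[[str], str]            # task instruction -> high-level plan
Policy = Callable[[str, List[str]], str]  # (prompt, action history) -> next action


def run_episode(task: str, planner: Planner, policy: Policy, max_steps: int = 5) -> List[str]:
    """Generate one global plan up front, then let the executor act step by step."""
    plan = planner(task)  # called exactly once; no re-planning during execution
    prompt = f"Task: {task}\nGlobal plan: {plan}"
    history: List[str] = []
    for _ in range(max_steps):
        action = policy(prompt, history)
        history.append(action)
        if action == "done":
            break
    return history


# Toy stand-ins so the sketch runs without any model access.
def toy_planner(task: str) -> str:
    return "1) find the object 2) pick it up 3) place it at the target"


def toy_policy(prompt: str, history: List[str]) -> str:
    script = ["go to shelf", "take mug", "put mug on table", "done"]
    return script[len(history)]


trajectory = run_episode("put the mug on the table", toy_planner, toy_policy)
print(trajectory)
```

    Because the plan only enters through the executor's prompt, the executor model itself is left unchanged in this sketch.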

    Two-stage training pipeline with no human annotations

    EAGLET’s planner is trained using a two-stage process that does not require any human-written plans or annotations.

    The first stage involves generating synthetic plans with high-capability LLMs such as GPT-5 and DeepSeek-V3.1-Think.

    These plans are then filtered using a new strategy called homologous consensus filtering, which retains only those that improve task performance for both expert and novice executor agents.
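
    The filtering idea can be sketched as follows. This is a simplified reading of the description above, not the paper's procedure: the `Scorer` signature, the `consensus_filter` function and the score table are all hypothetical, and real scores would come from running each executor on the task with and without the candidate plan.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical scorer signature: (task, plan or None) -> task score in [0, 1].
Scorer = Callable[[str, Optional[str]], float]


def consensus_filter(task: str, plans: List[str], executors: Dict[str, Scorer]) -> List[str]:
    """Keep only plans that raise every executor's score above its no-plan baseline."""
    baselines = {name: run(task, None) for name, run in executors.items()}
    kept = []
    for plan in plans:
        if all(run(task, plan) > baselines[name] for name, run in executors.items()):
            kept.append(plan)
    return kept


# Made-up scores for illustration only; None is the no-plan baseline.
SCORES = {
    "expert": {None: 0.7, "plan_a": 0.9, "plan_b": 0.8, "plan_c": 0.6},
    "novice": {None: 0.2, "plan_a": 0.5, "plan_b": 0.1, "plan_c": 0.4},
}


def make_scorer(name: str) -> Scorer:
    return lambda task, plan: SCORES[name][plan]


executors = {name: make_scorer(name) for name in SCORES}
kept = consensus_filter("heat the egg", ["plan_a", "plan_b", "plan_c"], executors)
print(kept)  # ['plan_a']
```

    In this toy data, plan_b helps only the expert and plan_c helps only the novice, so both are discarded; only plan_a helps both executors.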

    In the second stage, a rule-based reinforcement learning process further refines the planner, using a custom-designed reward function to assess how much each plan helps multiple agents succeed.

    Introducing the Executor Capability Gain Reward (ECGR)

    One of the key innovations of EAGLET is the Executor Capability Gain Reward (ECGR).

    This reward measures the value of the generated plan by examining whether it helps both high- and low-ability agents complete tasks more successfully and with fewer steps.

    It also includes a decay factor to favor shorter, more efficient task trajectories. This approach avoids over-rewarding plans that are only useful to already competent agents and promotes more generalizable planning guidance.
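
    The article does not give the exact ECGR formula, so the following is only a sketch of a reward with the same stated properties. The assumptions are ours: a successful rollout scores `decay ** steps`, a failure scores zero, and the reward is the average gain (with plan minus without plan) across executors. The names `Rollout`, `discounted_score` and `capability_gain_reward` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Rollout:
    success: bool
    steps: int


def discounted_score(rollout: Rollout, decay: float) -> float:
    """Assumed scoring: success discounted by trajectory length; failures score zero."""
    return decay ** rollout.steps if rollout.success else 0.0


def capability_gain_reward(pairs: List[Tuple[Rollout, Rollout]], decay: float = 0.9) -> float:
    """Average (with-plan minus without-plan) discounted score over all executors."""
    gains = [discounted_score(w, decay) - discounted_score(wo, decay) for w, wo in pairs]
    return sum(gains) / len(gains)


# Each pair is (rollout with plan, rollout without plan) for one executor.
helps_both = capability_gain_reward(
    [
        (Rollout(True, 2), Rollout(False, 5)),  # novice: fails without the plan
        (Rollout(True, 1), Rollout(True, 2)),   # expert: succeeds faster with it
    ],
    decay=0.5,
)
helps_expert_only = capability_gain_reward(
    [
        (Rollout(False, 5), Rollout(False, 5)),  # novice: fails either way
        (Rollout(True, 1), Rollout(True, 2)),    # expert: succeeds faster with it
    ],
    decay=0.5,
)
print(helps_both, helps_expert_only)  # 0.25 0.125
```

    Under these assumptions, a plan that only speeds up the already capable executor earns half the reward of one that also rescues the weaker executor.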

    Compatible with existing agents and models

    The EAGLET planner is designed to be modular and "plug and play," meaning it can be inserted into existing agent pipelines without retraining the executor.

    In evaluations, the planner boosted performance across a variety of foundation models, including GPT-4.1, GPT-5, Llama-3.1, and Qwen2.5.

    It also proved effective regardless of prompting strategy, working well with standard ReAct-style prompts as well as approaches like Reflexion.

    State-of-the-art performance across all benchmarks

    EAGLET was tested on three widely used benchmarks for long-horizon agent tasks: ScienceWorld, which simulates scientific experiments in a text-based laboratory environment; ALFWorld, which tasks agents with completing household activities through natural language in a simulated home setting; and WebShop, which evaluates goal-driven behavior in a realistic online shopping interface.

    Across all three, executor agents equipped with EAGLET outperformed their non-planning counterparts and other planning baselines, including MPO and KnowAgent.

    In experiments with the open-source Llama-3.1-8B-Instruct model, EAGLET raised average performance from 39.5 to 59.4, a gain of 19.9 points across tasks.

    On ScienceWorld’s unseen scenarios, it raised performance from 42.2 to 61.6.

    On ALFWorld’s seen scenarios, EAGLET improved results from 22.9 to 54.3, a more than 2.3× increase.

    Gains also held with more capable models, which start from a higher baseline.

    For example, with EAGLET the average score of GPT-4.1 improved from 75.5 to 82.2, and that of GPT-5 rose from 84.5 to 88.1, despite GPT-5 already being a strong performer.

    In some benchmarks, the performance gain was as high as +11.8 points, such as when combining EAGLET with the ETO executor method on ALFWorld’s unseen tasks.

    Compared to other planning baselines such as MPO, EAGLET consistently delivered higher task completion rates. For example, on ALFWorld’s unseen tasks with GPT-4.1, MPO scored 79.1 points, while EAGLET scored 83.6 points – a 4.5-point advantage.

    Additionally, the paper reports that agents using EAGLET complete the task in fewer steps on average. With GPT-4.1 as the executor, the average step count dropped from 13.0 (no planner) to 11.1 (EAGLET). With GPT-5, it dropped from 11.4 to 9.4, supporting the claim of better execution efficiency.
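
    The figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers reported above; the helper names `point_gain`, `ratio` and `relative_reduction` are ours, and the relative step reductions it prints are derived here rather than stated in the paper.

```python
def point_gain(before: float, after: float) -> float:
    """Absolute improvement in benchmark points."""
    return round(after - before, 1)


def ratio(before: float, after: float) -> float:
    """Multiplicative improvement (after / before)."""
    return round(after / before, 2)


def relative_reduction(before: float, after: float) -> float:
    """Percentage drop relative to the starting value."""
    return round(100 * (before - after) / before, 1)


scores = {
    "Llama-3.1-8B-Instruct, average": (39.5, 59.4),
    "ScienceWorld, unseen": (42.2, 61.6),
    "ALFWorld, seen": (22.9, 54.3),
    "GPT-4.1, average": (75.5, 82.2),
    "GPT-5, average": (84.5, 88.1),
}
for name, (before, after) in scores.items():
    print(f"{name}: +{point_gain(before, after)} points, {ratio(before, after)}x")

steps = {"GPT-4.1": (13.0, 11.1), "GPT-5": (11.4, 9.4)}
for name, (before, after) in steps.items():
    print(f"{name}: {relative_reduction(before, after)}% fewer steps")
```

    Run as written, this reproduces the 19.9-point average gain and the roughly 2.37× ALFWorld improvement, and puts the step savings at about 15 to 18 percent.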

    Efficiency gains in training and execution

    Compared to RL-based methods like GiGPO, which can require hundreds of training iterations, EAGLET achieved better or comparable results with about one-eighth of the training effort.

    This efficiency carries over to execution: agents using EAGLET typically need fewer steps to complete tasks, which translates into lower inference time and compute cost in production scenarios.

    No public code—yet

    As of the version submitted to arXiv, the authors have not released an open-source implementation of EAGLET. It is unclear when or under what license the code will be released, or how it will be maintained, which may limit the near-term usefulness of the framework for enterprise deployments.

    VentureBeat has reached out to the authors to clarify these points and will update this piece if they respond.

    Enterprise deployment questions remain

    While the planner is described as plug-and-play, it is unclear whether EAGLET can be easily integrated into popular enterprise agent frameworks like LangChain or AutoGen, or whether it requires a custom stack to support the separation of planning and execution.

    Similarly, the training setup relies on multiple executor agents, which may be difficult to replicate in an enterprise environment with limited model access. VentureBeat has asked the researchers whether the homologous consensus filtering method could be adapted for teams that have only one executor model or limited compute resources.

    The authors of EAGLET report success across model types and sizes, but it is not yet known what the minimum viable model scale is for practical deployment. For example, can enterprise teams effectively use the planner with a sub-10B-parameter open model in latency-sensitive environments? Additionally, the framework may provide industry-specific value in domains such as customer support or IT automation, but it remains to be seen how easily the planner can be fine-tuned or adapted for such verticals.

    Real-time vs. pre-generated planning

    Another open question is how EAGLET is best deployed in practice. Should the planner run in real time alongside executors within a loop, or is it better used offline to pre-generate global plans for known task types? Each approach has implications for latency, cost, and operational complexity. VentureBeat has posed this question to the authors and will report any insights that emerge.
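
    One way to picture the two deployment options is a plan source that serves a pre-generated plan when the task type is known and otherwise calls the planner live. This is a hypothetical sketch for discussion, not something the paper proposes: `PlanSource`, `get_plan` and the task-type keys are invented for illustration, and `toy_planner` stands in for a real planner call.

```python
from typing import Callable, Dict, Optional


class PlanSource:
    """Serve a pre-generated plan when one exists, otherwise call the planner live."""

    def __init__(self, planner: Callable[[str], str], pregenerated: Optional[Dict[str, str]] = None):
        self.planner = planner
        self.pregenerated = dict(pregenerated or {})
        self.live_calls = 0  # counts how often the live (latency-adding) path is taken

    def get_plan(self, task_type: str, instruction: str) -> str:
        if task_type in self.pregenerated:
            return self.pregenerated[task_type]  # offline path: no planner call
        self.live_calls += 1
        return self.planner(instruction)  # real-time path: one planner call per task


def toy_planner(instruction: str) -> str:
    return f"plan for: {instruction}"


source = PlanSource(toy_planner, {"password_reset": "1) verify identity 2) reset 3) confirm"})
offline_plan = source.get_plan("password_reset", "reset the password for user 42")
live_plan = source.get_plan("vpn_issue", "diagnose a dropped VPN connection")
print(offline_plan)
print(live_plan)
print(source.live_calls)
```

    The tradeoff shows up directly in the sketch: the offline path adds no planner latency but ignores the specific instruction, while the live path tailors the plan at the cost of one extra model call per task.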

    Strategic tradeoffs for enterprise teams

    For technology leaders in medium to large enterprises, EAGLET offers a compelling proof of concept for improving the reliability and efficiency of LLM agents. But without public tooling or implementation guidelines, the framework still presents a build-versus-wait decision. Enterprises must weigh the potential gains in performance and efficiency against the cost of reproducing or approximating the training process in-house.

    Potential use cases in enterprise settings

    For enterprises developing agentic AI systems – especially in environments requiring stepwise planning, such as IT automation, customer support, or online interactions – EAGLET provides a template for how to incorporate planning without retraining the executor. Its efficient training methodology and its ability to guide both open- and closed-source models may make it an attractive starting point for teams looking to improve agent performance with minimal overhead.
