    Researchers find that retraining only small parts of an AI model can cut costs and prevent forgetting

By PineapplesUpdate | October 14, 2025

Enterprises often find that fine-tuning, an effective way to make a large language model (LLM) fit for purpose and grounded in their own data, can cause the model to lose some of its capabilities. After fine-tuning, some models “forget” how to perform certain tasks or degrade on functions they had already learned.

Research from the University of Illinois Urbana-Champaign proposes a new method for retraining models that avoids “catastrophic forgetting,” in which the model loses some of its prior knowledge. The paper focuses on two vision-language models that generate responses from images: LLaVA and Qwen2.5-VL.

The approach encourages enterprises to retrain only narrow parts of the LLM rather than the entire model, which would significantly increase computation costs. The team argues that catastrophic forgetting is not actual memory loss, but a side effect of bias drift in the output distribution.

“Training a new LMM can cost millions of dollars, weeks of time, and emit hundreds of tons of CO2, so finding ways to more efficiently and effectively update existing models is a serious concern,” the team wrote in the paper. “Guided by this result, we explore tuning recipes that preserve learning while limiting output shift.”

    The researchers focused on the multi-layer perceptron (MLP), the internal decision-making component of the model.
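To make the idea of selective retraining concrete, here is a minimal sketch, not the authors' code, of how parameter-selective fine-tuning can be set up in PyTorch: freeze every parameter, then unfreeze only the sub-modules you intend to retrain. The toy block and its layer names (self_attn.q_proj, mlp.up_proj, and so on) are hypothetical stand-ins; real parameter names depend on the specific LLaVA or Qwen2.5-VL checkpoint.

```python
# Minimal sketch of parameter-selective fine-tuning in PyTorch (illustrative,
# not the paper's code). ToyBlock only mimics common decoder-layer naming;
# real parameter names depend on the checkpoint you load.
import torch.nn as nn


class ToyAttention(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.q_proj = nn.Linear(d, d)
        self.k_proj = nn.Linear(d, d)
        self.v_proj = nn.Linear(d, d)
        self.o_proj = nn.Linear(d, d)


class ToyMLP(nn.Module):
    def __init__(self, d, hidden):
        super().__init__()
        self.gate_proj = nn.Linear(d, hidden)
        self.up_proj = nn.Linear(d, hidden)
        self.down_proj = nn.Linear(hidden, d)


class ToyBlock(nn.Module):
    def __init__(self, d=64, hidden=256):
        super().__init__()
        self.self_attn = ToyAttention(d)
        self.mlp = ToyMLP(d, hidden)


def select_trainable(model: nn.Module, substrings) -> float:
    """Freeze all parameters, then unfreeze those whose names contain any of
    the given substrings. Returns the fraction of parameters left trainable."""
    for p in model.parameters():
        p.requires_grad = False
    for name, p in model.named_parameters():
        if any(key in name for key in substrings):
            p.requires_grad = True
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return trainable / sum(p.numel() for p in model.parameters())


if __name__ == "__main__":
    block = ToyBlock()
    frac = select_trainable(block, ("mlp.",))  # e.g. tune only the MLP
    print(f"{frac:.1%} of parameters are trainable")
```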

Catastrophic forgetting

The researchers first wanted to verify the existence and cause of catastrophic forgetting in the models.

To do this, they created a set of target tasks for the models to complete. The models were then fine-tuned and evaluated to determine whether they exhibited substantial forgetting. But as the process progressed, the researchers found that the models were regaining some of their abilities.

“We also saw a surprising result: while the model’s performance would drop significantly on the held-out benchmarks after training on the counting task, it would mostly recover on PathVQA, another specialized task that is not well represented in the benchmarks,” the researchers said. “Meanwhile, when performing the forgetting-mitigation experiments, we also tried tuning only the self-attention projection (SA Proj) or the MLP layers separately, inspired by the finding that tuning only the LLM was generally better than tuning the full model. This led to another very surprising result: tuning only the self-attention projection layers led to very good learning of the target tasks, with no decline in performance on the held-out tasks, even after training on all five target tasks in a sequence.”
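As a usage sketch, and assuming the select_trainable helper and toy layer names from the earlier example, the two recipes described in the quote could be expressed simply by choosing different name patterns. The substrings below are hypothetical and would vary with the actual checkpoint.

```python
# Hypothetical name patterns for the two recipes described above, reusing the
# select_trainable helper and ToyBlock from the earlier sketch.
SA_PROJ_ONLY = ("self_attn.q_proj", "self_attn.k_proj",
                "self_attn.v_proj", "self_attn.o_proj")
MLP_ONLY = ("mlp.gate_proj", "mlp.up_proj", "mlp.down_proj")

block = ToyBlock()
print(f"SA Proj recipe: {select_trainable(block, SA_PROJ_ONLY):.1%} trainable")
print(f"MLP recipe:     {select_trainable(block, MLP_ONLY):.1%} trainable")
```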

    The researchers said they believe “what appears to be forgetting or interference after fine-tuning on a narrow target task is actually a bias in the output distribution due to task distribution shifts.”

Narrow retraining

That discovery proved to be the key to the experiment. The researchers noted that tuning the MLP increases the likelihood of outputting numerical tokens, with a “highly correlated decline” in held-out task accuracy. This suggested that a model’s apparent forgetting of some of its knowledge is temporary rather than permanent.
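One way to picture the bias the researchers describe is to compare how much probability mass a model places on numeric tokens before and after fine-tuning. The sketch below is purely illustrative: the logits and token ids are random stand-ins for values you would obtain from a real model and tokenizer.

```python
# Illustrative only: measuring how much next-token probability mass falls on
# numeric tokens. Random tensors stand in for real model outputs.
import torch


def numeric_mass(logits: torch.Tensor, numeric_token_ids: list[int]) -> float:
    """Average probability mass assigned to the given token ids."""
    probs = torch.softmax(logits, dim=-1)
    return probs[:, numeric_token_ids].sum(dim=-1).mean().item()


vocab_size = 32_000
numeric_token_ids = list(range(10))            # stand-in ids for digit tokens
logits_before = torch.randn(8, vocab_size)     # stand-in: before fine-tuning
logits_after = torch.randn(8, vocab_size) * 2  # stand-in: after fine-tuning

drift = (numeric_mass(logits_after, numeric_token_ids)
         - numeric_mass(logits_before, numeric_token_ids))
print(f"change in probability mass on numeric tokens: {drift:+.4f}")
```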

“To avoid bias in the output distribution, we tune the MLP up/gate projections while keeping the down projection constant, and find that this achieves similar learning to full MLP tuning with little forgetting,” the researchers said.
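In the same toy setup as the earlier sketches, the recipe in the quote, tuning the MLP up and gate projections while leaving the down projection frozen, would amount to selecting only those two names. Again, the names are placeholders for whatever the real checkpoint uses.

```python
# Reusing the earlier select_trainable helper and ToyBlock: tune only the MLP
# up/gate projections, leaving down_proj (and everything else) frozen.
UP_GATE_ONLY = ("mlp.up_proj", "mlp.gate_proj")

block = ToyBlock()
select_trainable(block, UP_GATE_ONLY)

assert block.mlp.up_proj.weight.requires_grad
assert block.mlp.gate_proj.weight.requires_grad
assert not block.mlp.down_proj.weight.requires_grad
assert not block.self_attn.q_proj.weight.requires_grad
```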

    This allows a simpler and more reproducible method for fine-tuning a model.

    By focusing on a narrow section of the model rather than wholesale retraining, enterprises can cut computation costs. It also allows better control of output drift.

However, the research focused on only two models, both of them vision-language models. The researchers said they were unable to experiment with other models due to limited resources.

Still, they believe their findings could extend to other LLMs, including those built for different modalities.
