
Enterprises often find that fine-tuning a large language model (LLM) to fit their purpose and data can cause the model to lose some of its capabilities. After fine-tuning, some models “forget” how to perform certain tasks or other functions they had already learned.
Research from the University of Illinois Urbana-Champaign proposes a new method for retraining models that avoids “catastrophic forgetting,” in which the model loses some of its prior knowledge. The paper focuses on two vision-language LLMs that generate responses from images: LLaVA and Qwen2.5-VL.
The approach encourages enterprises to retrain only narrow parts of the LLM, rather than retraining the entire model and significantly increasing computation costs. The team claims that catastrophic forgetting is not actual memory loss, but a side effect of bias drift.
“Training a new LMM can cost millions of dollars, weeks of time, and emit hundreds of tons of CO2, so finding ways to more efficiently and effectively update existing models is a serious concern,” the team wrote in the paper. “Guided by this result, we explore tuning recipes that preserve learning while limiting output shift.”
The researchers focused on the multi-layer perceptron (MLP), the internal decision-making component of the model.
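For context, the MLP block in LLaVA- and Qwen-style transformers is typically a SwiGLU-style feed-forward layer built from three linear projections, commonly named gate_proj, up_proj and down_proj. The sketch below is a generic PyTorch illustration of that structure; the module names and sizes are assumptions for illustration, not code from the paper.

```python
# Generic SwiGLU-style MLP block, as found in LLaMA/Qwen-style transformer
# layers. Names and sizes are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn


class SwiGLUMLP(nn.Module):
    def __init__(self, hidden_size: int = 4096, intermediate_size: int = 11008):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # gate/up expand the hidden state; down projects it back to hidden_size
        return self.down_proj(self.act(self.gate_proj(x)) * self.up_proj(x))
```

The distinction between the up/gate projections and the down projection matters later, when the researchers describe which parts of the MLP to freeze.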
Catastrophic forgetting
The researchers first wanted to verify the existence and cause of catastrophic forgetting in the models.
To do this, they created a set of target tasks for the models to complete. The models were then fine-tuned and evaluated to determine whether they exhibited substantial forgetting. But as the process progressed, the researchers found that the models regained some of their abilities.
“We also saw a surprising result, that while the model’s performance would drop significantly on the held-out benchmarks after training on the counting task, it would mostly recover on PathVQA, another specialized task that is not well represented in the benchmarks,” the researchers said. “Meanwhile, when performing the forgetting mitigation experiments, we also tried tuning only the self-attention projection (SA Proj) or the MLP layers separately, inspired by the finding that tuning only the LLM was generally better than tuning the full model. This led to another very surprising result: tuning only the self-attention projection layers led to very good learning of the target tasks, with no drop in performance on the held-out tasks, even after training on all five target tasks in sequence.”
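As a rough illustration of that setup, the sketch below freezes every parameter of a Hugging Face causal LM except the self-attention projection layers. The q_proj/k_proj/v_proj/o_proj substrings follow common LLaMA/Qwen naming conventions, and a text-only Qwen checkpoint stands in for the multimodal models in the paper; this is an assumption-laden sketch, not the authors’ code.

```python
# Sketch: fine-tune only the self-attention projection (SA Proj) layers,
# freezing everything else. Parameter-name substrings are assumptions based
# on LLaMA/Qwen-style naming; inspect model.named_parameters() for your model.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

SA_PROJ_KEYS = ("q_proj", "k_proj", "v_proj", "o_proj")

trainable = 0
for name, param in model.named_parameters():
    param.requires_grad = any(key in name for key in SA_PROJ_KEYS)
    if param.requires_grad:
        trainable += param.numel()

total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable:,} / {total:,}")
# The unfrozen parameters can then be handed to a standard optimizer, e.g.
# torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=1e-5)
```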
The researchers said they believe “what appears to be forgetting or interference after fine-tuning on a narrow target task is actually a bias in the output distribution due to task distribution shifts.”
Narrow retraining
That discovery proved to be the key to the experiment. The researchers noted that tuning the MLP increases the likelihood of outputting numeric tokens, along with “highly correlated declines in held-out task accuracy.” This suggested that a model forgetting some of its knowledge is a temporary effect rather than a lasting loss.
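One simple way to watch for that kind of drift is to compare how much probability mass the model places on numeric tokens before and after fine-tuning on a counting task. The helper below is a hypothetical sketch using Hugging Face Transformers; the checkpoint paths and the digit-token heuristic are assumptions for illustration, not the paper’s methodology.

```python
# Sketch: sum the next-token probability assigned to digit-only tokens,
# a crude proxy for the output-distribution bias described above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def numeric_token_mass(model_name: str, prompt: str) -> float:
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    # Token ids that decode to a purely numeric string.
    digit_ids = [
        tok_id for tok_id in range(len(tokenizer))
        if tokenizer.decode([tok_id]).strip().isdigit()
    ]

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]
    probs = torch.softmax(next_token_logits, dim=-1)
    return probs[digit_ids].sum().item()


# Hypothetical comparison of a base model and a counting-task fine-tune:
# before = numeric_token_mass("Qwen/Qwen2.5-7B-Instruct", "Describe the image.")
# after = numeric_token_mass("./counting-finetune", "Describe the image.")
```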
“To avoid this bias in the output distribution, we tune the MLP up/gating projections while keeping the down projection frozen, and find that this achieves learning similar to full MLP tuning with little forgetting,” the researchers said.
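In code, that recipe amounts to unfreezing only the MLP up and gate projections while leaving the down projection (and everything else) frozen. The sketch below assumes LLaMA/Qwen-style module names and is an illustration of the idea, not the authors’ released implementation.

```python
# Sketch: train only the MLP up/gate projections; keep down_proj and all
# other weights frozen. Module-name substrings are assumed, not verified.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

for name, param in model.named_parameters():
    # Unfreeze up_proj/gate_proj; down_proj, attention and embeddings stay frozen.
    param.requires_grad = ("up_proj" in name) or ("gate_proj" in name)

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5
)
# Fine-tune on the narrow target task as usual; the frozen down projection is
# what the researchers credit with limiting drift in the output distribution.
```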
This points to a simpler and more reproducible method for fine-tuning a model.
By focusing on a narrow section of the model rather than wholesale retraining, enterprises can cut computation costs. It also allows better control of output drift.
However, the research focused on only two models, both of them vision-language models. The researchers said they were unable to experiment with other models due to limited resources.
Still, they believe their findings can extend to other LLMs, including those with different modalities.

