    Nvidia researchers unlock 4-bit LLM training that matches 8-bit performance

By PineapplesUpdate | October 30, 2025

Nvidia researchers have developed an innovative approach to training large language models (LLMs) in a 4-bit quantized format while maintaining the stability and accuracy of high-precision models. Their technique, NVFP4, makes it possible to train models that not only outperform other leading 4-bit formats but also match the performance of the larger 8-bit FP8 format, while using half the memory and a fraction of the compute.

The success of NVFP4 suggests that enterprises can continue to cut inference costs by running smaller models that match the performance of larger ones. It also points to a future where the cost of training LLMs falls to the point that many more organizations can train their own custom models from scratch rather than fine-tuning existing ones.

The quantization challenge

Model quantization is a technique for reducing the computational and memory costs of running and training AI models. It works by converting a model’s parameters, or weights, from high-precision formats such as 16- and 32-bit floating point (BF16 and FP32) to lower-precision formats. The core challenge of quantization is to shrink the model while preserving as much of its knowledge and capability as possible.
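To make the trade-off concrete, here is a minimal sketch in Python of a generic symmetric (absmax) quantizer. This illustrates the basic idea only; it is not Nvidia's NVFP4 recipe:

import numpy as np

def absmax_quantize(weights: np.ndarray, levels: int = 16):
    # One scale for the whole tensor: map the largest magnitude onto
    # the largest representable level (a 4-bit code has 16 levels).
    scale = np.max(np.abs(weights)) / (levels // 2 - 1)
    q = np.clip(np.round(weights / scale), -(levels // 2), levels // 2 - 1)
    return q.astype(np.int8), scale  # small integers plus one scale factor

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale  # approximate reconstruction

w = np.random.randn(8).astype(np.float32)
q, s = absmax_quantize(w)
print(w)
print(dequantize(q, s))  # close to w, but visibly coarser

A single per-tensor scale like this is exactly what breaks down at 4 bits: one outlier stretches the scale and crushes the resolution of every other value, which is the problem NVFP4's multi-level scaling addresses below.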

In recent years, the 8-bit floating point format (FP8) has become a popular industry standard, offering a good balance between performance and efficiency. It significantly reduces the computational cost and memory demands of LLM training without large degradation in accuracy.

    The next logical step is 4-bit floating point (FP4), which promises to halve memory usage again and further boost performance on advanced hardware. However, this transition has been challenging. Existing 4-bit formats, such as MXFP4, often struggle to maintain the same level of accuracy as their 8-bit counterparts, leading to a difficult compromise between cost and performance.

    How does NVFP4 work?

NVFP4 overcomes the stability and accuracy challenges of other FP4 techniques through an improved design and a targeted training methodology. A major issue with 4-bit precision is its extremely limited range: it can represent only 16 different values. When converting from a high-precision format, outlier values can distort the entire dataset and harm the model's accuracy. NVFP4 uses a more sophisticated, multi-level scaling approach that handles these outliers better, allowing “a more precise and accurate representation of tensor values during training,” according to Nvidia.
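A simplified sketch of that multi-level idea, in Python: each small block of values gets its own scale, so an outlier distorts only its own block rather than the whole tensor. The FP4 (E2M1) value set below is standard; the block size and scale handling are illustrative simplifications (Nvidia's actual encoding also compresses the per-block scales and applies a second, per-tensor scale):

import numpy as np

# The values a signed FP4 (E2M1) number can represent: 16 codes,
# 15 distinct values once +0 and -0 collapse.
FP4 = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4 = np.unique(np.concatenate([-FP4, FP4]))

def blockwise_fp4(x: np.ndarray, block: int = 16) -> np.ndarray:
    out = np.empty_like(x)
    for i in range(0, x.size, block):
        chunk = x[i:i + block]
        # Per-block scale: the block's largest magnitude maps onto 6.0,
        # the largest FP4 value.
        scale = max(np.max(np.abs(chunk)) / 6.0, 1e-12)
        nearest = np.abs(chunk[:, None] / scale - FP4).argmin(axis=1)
        out[i:i + block] = FP4[nearest] * scale
    return out

x = np.random.randn(64).astype(np.float32)
x[7] = 40.0  # a single outlier in the first block
err = np.abs(blockwise_fp4(x) - x)
print(err[16:].max())  # blocks without the outlier keep fine-grained accuracy

With one global scale, the outlier at index 7 would flatten almost every other weight toward zero; with per-block scales, the damage is confined to one 16-value block.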

Beyond the format itself, the researchers present a 4-bit training recipe that achieves accuracy comparable to FP8. A central component is a “mixed-precision strategy”: instead of converting the entire model to NVFP4, most layers are quantized while a small fraction of numerically sensitive layers are kept in a higher-precision format such as BF16. This maintains stability where it matters most. The methodology also adjusts how gradients are calculated during backpropagation, the learning phase of the model, to reduce the biases that accumulate from low-precision arithmetic.
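A small sketch of both ideas follows. The layer-name patterns used to flag sensitive layers are hypothetical placeholders (the paper selects such layers empirically), and stochastic rounding is shown as one common way to reduce accumulated rounding bias; the paper's exact gradient recipe may differ:

import torch
import torch.nn as nn

def split_layers(model: nn.Module, sensitive=("lm_head", "embed")):
    # Partition Linear layers: most are candidates for NVFP4, while a
    # small fraction of numerically sensitive ones stay in BF16.
    to_quantize, to_keep = [], []
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            (to_keep if any(k in name for k in sensitive)
             else to_quantize).append(name)
    return to_quantize, to_keep

def stochastic_round(x: torch.Tensor) -> torch.Tensor:
    # Round up with probability equal to the fractional part, so the
    # rounding error is zero on average rather than systematically biased.
    floor = x.floor()
    return floor + (torch.rand_like(x) < (x - floor)).float()

For example, stochastic_round(torch.tensor([0.3])) yields 0.0 about 70% of the time and 1.0 about 30% of the time, so repeated low-precision updates average out to the right value instead of drifting.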

    NVFP4 in practice

To test their approach, the Nvidia team trained a powerful 12-billion-parameter hybrid Mamba-Transformer model on a massive 10 trillion tokens. They then compared its performance to a baseline model trained in the widely popular FP8 format. The results showed that the NVFP4 model's training loss and downstream task accuracy closely tracked the FP8 version throughout the run.

Performance held up across a wide range of domains, including knowledge-intensive reasoning, mathematics, and general-knowledge tasks, with only a slight decline on coding benchmarks late in training.

    "“To our knowledge, this is the first successful demonstration of training a billion-parameter language model with 4-bit precision over a multi-trillion-token horizon, laying the foundation for faster and more efficient training of future frontier models,” the researchers wrote.

In practice, NVFP4’s 4-bit precision format enables developers and businesses to train and deploy AI models with the same accuracy as traditional 8-bit formats, according to Shar Narasimhan, director of product for AI and data center GPUs at Nvidia.

    “By training model weights directly in a 4-bit format while maintaining accuracy, it empowers developers to experiment with new architectures, iterate faster, and uncover insights without being hamstrung by resource constraints,” he told VentureBeat.

In contrast, FP8 (already a leap forward from FP16) still imposes limits on model size and inference performance due to high memory and bandwidth demands. “NVFP4 breaks that boundary, providing equivalent quality with dramatically more scope for development and experimentation,” said Narasimhan.

When compared to the alternative 4-bit format MXFP4, the benefits of NVFP4 become even more apparent. In an experiment with an 8-billion-parameter model, NVFP4 converged to a better loss score than MXFP4. To reach the same performance level as the NVFP4 model, the MXFP4 model had to be trained on 36% more data, significantly increasing training time and cost.

    In addition to making pretraining more efficient, NVFP4 also redefines what is possible. “Showing that 4-bit precision can preserve model quality at scale opens the door to a future where highly specialized models can be trained by medium-sized enterprises or startups, not just hyperscalers,” Narasimhan said.

Beyond pretraining

Although the paper focuses on the benefits of NVFP4 during pretraining, its impact also extends to inference.

    “Models trained on NVFP4 can not only provide faster inference and higher throughput, but also reduce the time required for AI factories to achieve ROI – accelerating the cycle from model development to real-world deployment,” Narasimhan said.

    Because these models are smaller and more efficient, they unlock new possibilities for serving complex, high-quality responses in real time, even in token-intensive, agentic applications, without increasing energy and computation costs.

    Narasimhan said he is looking towards a future of model efficiency that is not just about reducing precision, but about creating smarter systems.

    “There are many opportunities to expand research into lower precision as well as modify the architecture to address components that dominate computation in large-scale models,” he said. “These areas are rich with opportunities, especially as we move toward agentic systems that demand higher throughput, lower latency, and adaptive reasoning. NVFP4 proves that precision can be optimized without compromising quality, and it sets the stage for a new era of intelligent, efficient AI design.”
