Pineapples Update

    Inside Ring-1T: Ant engineers solve the hurdles of reinforcement learning at trillion scale

    By PineapplesUpdate | October 25, 2025

    China's Ant Group, an affiliate of Alibaba, has released detailed technical information about its new model, Ring-1T, which the company calls “the first open-source reasoning model with one trillion total parameters.”

    Ring-1T aims to compete with other reasoning models such as OpenAI's GPT-5 and o-series, as well as Google's Gemini 2.5. With the release of its latest model, Ant extends the geopolitical debate over who will dominate the AI race: China or the US.

    Ant Group said Ring-1T is optimized for mathematical and logical problems, code generation and scientific problem-solving.

    “With approximately 50 billion active parameters per token, Ring-1T achieves state-of-the-art performance across multiple challenging benchmarks – despite relying entirely on natural language reasoning capabilities,” Ant said in a paper.
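    To put the two quoted figures side by side, the short calculation below works out what share of the model is active for any one token. It uses only the numbers stated above (one trillion total, roughly 50 billion active); the function name is an illustrative choice, not something from Ant's paper.

```python
# Figures quoted in the article; the active count is approximate.
TOTAL_PARAMS = 1_000_000_000_000   # "one trillion total parameters"
ACTIVE_PARAMS = 50_000_000_000     # "approximately 50 billion active parameters per token"

def active_fraction(active, total):
    """Share of the model's parameters used to process a single token."""
    return active / total

print(f"{active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")  # prints 5.0%
```

    In other words, only about one parameter in twenty takes part in processing a given token.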

    Ring-1T, which was first released in preview in September, adopts the same architecture as Ling 2.0 and is trained on the Ling-1T-base model that the company released earlier this month. Ant said this allows the model to support a context window of up to 128,000 tokens.

    To train large models like Ring-1T, researchers had to develop new methods to enhance reinforcement learning (RL).

    New methods of training

    Ant Group developed three “interconnected innovations” to support RL and the training of Ring-1T, a challenge given the model's size and the large compute requirements that typically come with it. The three are Icepop, C3PO++ and ASystem.

    Icepop removes noisy gradient updates to stabilize training without slowing down inference. This helps eliminate destructive training-inference misalignment in RL. The researchers noted that when training models, particularly those using a mixture-of-experts (MoE) architecture like Ring-1T, the probabilities computed during training and during inference can often disagree.

    “This problem is particularly pronounced in training MoE models with RL due to the implicit use of dynamic routing mechanisms. Additionally, in long CoT settings, these inconsistencies can gradually accumulate and propagate across iterations,” the researchers said.

    Icepop “suppresses unstable training updates through two-way masking calibration.”
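    The masking idea can be illustrated with a small sketch. This is not Ant's implementation: it assumes that each token has one log-probability from the training engine and one from the inference engine, and that a token's update is dropped when the ratio between the two falls outside a band. The bounds `low` and `high`, and both function names, are hypothetical values chosen only for illustration.

```python
import math

def two_sided_mask(train_logprobs, infer_logprobs, low=0.5, high=2.0):
    """Return a 0/1 mask per token: 1 keeps the token's update, 0 drops it.

    low and high are hypothetical bounds, not values from the paper.
    """
    mask = []
    for lp_train, lp_infer in zip(train_logprobs, infer_logprobs):
        ratio = math.exp(lp_train - lp_infer)  # p_train / p_infer for this token
        mask.append(1 if low <= ratio <= high else 0)
    return mask

def masked_mean_loss(token_losses, mask):
    """Average the per-token losses over the tokens the mask keeps."""
    kept = sum(mask)
    if kept == 0:
        return 0.0  # every token was masked, so nothing contributes
    return sum(loss * m for loss, m in zip(token_losses, mask)) / kept

# Three tokens: the engines agree on the first and disagree on the other two.
train = [math.log(0.5), math.log(0.9), math.log(0.1)]
infer = [math.log(0.5), math.log(0.3), math.log(0.4)]
mask = two_sided_mask(train, infer)
print(mask, masked_mean_loss([1.0, 2.0, 3.0], mask))
```

    The mask is two-sided because it rejects tokens whose training probability is either much higher or much lower than the inference probability, so a mismatch in either direction is kept out of the update.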

    The second method, C3PO++, is an improved version of the C3PO system that Ant previously established. It manages how Ring-1T and other extra-large models generate and process training examples, which the researchers call rollouts, so that GPUs don't sit idle.

    C3PO++ works by splitting rollout work into pieces that can be processed in parallel. One group of GPUs forms the inference pool, which generates new data, and the other forms the training pool, which collects results to update the model. C3PO++ sets a token budget to control how much data is processed, ensuring that the GPUs are used efficiently.
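    A toy simulation can make the token budget concrete. The sketch below is an assumption-laden illustration, not the C3PO++ code: it assumes rollouts are generated in small chunks, that finished rollouts move to the training pool, and that unfinished ones are carried into the next iteration. The `Rollout` class, the `chunk` size and the carry-over behavior are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Rollout:
    rollout_id: int
    target_len: int      # tokens needed before the rollout is complete (hypothetical)
    generated: int = 0   # tokens produced so far

def run_iteration(inference_pool, token_budget, chunk=4):
    """Generate tokens round-robin until the budget is spent.

    Returns (training_pool, still_running, tokens_used).
    """
    training_pool = []
    tokens_used = 0
    active = list(inference_pool)
    while active and tokens_used < token_budget:
        still_active = []
        for r in active:
            remaining_budget = token_budget - tokens_used
            if remaining_budget <= 0:
                still_active.append(r)  # budget spent: carry this rollout over
                continue
            step = min(chunk, r.target_len - r.generated, remaining_budget)
            r.generated += step
            tokens_used += step
            if r.generated == r.target_len:
                training_pool.append(r)  # finished: hand over for the model update
            else:
                still_active.append(r)
        active = still_active
    return training_pool, active, tokens_used

pool = [Rollout(0, 4), Rollout(1, 10), Rollout(2, 6)]
done, pending, used = run_iteration(pool, token_budget=12)
print([r.rollout_id for r in done], [r.rollout_id for r in pending], used)
```

    With a budget of 12 tokens, the first iteration finishes only the shortest rollout and stops. The two longer ones keep their partial output and can be resumed in a later call, so a single long rollout never holds up the training step.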

    The final new method, ASystem, adopts a single-controller + SPMD (single program, multiple data) architecture to enable asynchronous operations.
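    The general single-controller plus SPMD pattern can be sketched with Python's standard library. This is a generic illustration of the pattern, not ASystem's API: one controller shards the data, every worker runs the same program on its own shard, and results are gathered as they finish. The names `worker_program` and `controller`, and the use of threads in place of GPUs, are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def worker_program(rank, shard):
    """The single program every worker runs; only the data shard differs."""
    return rank, sum(x * x for x in shard)

def controller(data, num_workers=4):
    """Single controller: shard the data, dispatch, gather as results arrive."""
    shards = [data[i::num_workers] for i in range(num_workers)]
    results = {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(worker_program, rank, shard)
                   for rank, shard in enumerate(shards)]
        for fut in as_completed(futures):  # completion order, not submission order
            rank, partial = fut.result()
            results[rank] = partial
    return sum(results.values()), results

total, per_rank = controller(list(range(8)))
print(total, per_rank)
```

    Gathering with `as_completed` means the controller handles each result as soon as it is ready instead of waiting for the slowest worker.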

    Benchmark results

    Ant evaluated Ring-1T on benchmarks measuring performance in math, coding, logical reasoning and general tasks. It tested the model against DeepSeek-V3.1-Terminus-Thinking, Qwen3-235B-A22B-Thinking-2507, Gemini 2.5 Pro and GPT-5 Thinking.

    In benchmark testing, Ring-1T performed strongly, ranking second behind OpenAI's GPT-5 on most benchmarks. Ant said Ring-1T showed the best performance among all the open-weight models it tested.

    The model posted a score of 93.4% on the AIME 25 benchmark, second only to GPT-5. In coding, Ring-1T outperformed both DeepSeek and Qwen.

    “This indicates that our carefully synthesized dataset shapes Ring-1T’s strong performance on programming applications, creating a strong foundation for future efforts on agentic applications,” the company said.

    Ring-1T shows how much Chinese companies are investing in models

    Ring-1T is the latest model from China that aims to dethrone GPT-5 and Gemini.

    Since DeepSeek's surprise launch in January, Chinese companies have been releasing impressive models at a rapid pace. Ant's parent company, Alibaba, recently released Qwen3-Omni, a multimodal model that seamlessly integrates text, image, audio and video. DeepSeek has also continued to improve its models and, earlier this month, launched DeepSeek-OCR, a new model that reimagines how models process information.

    With Ring-1T and Ant's new methods for training and scaling extra-large models, the battle for AI dominance between the US and China continues to escalate.
