
QwenLong-L1 solves a long-context reasoning challenge that stumps current LLMs

By PineapplesUpdate · May 31, 2025 · 5 min read



Alibaba Group has introduced QwenLong-L1, a new framework that enables large language models (LLMs) to reason over extremely long inputs. This development could unlock a new wave of enterprise applications that require models to understand and draw insights from extensive documents, such as detailed corporate filings, lengthy financial statements, or complex legal contracts.

The challenge of long-form reasoning for AI

Recent advances in large reasoning models (LRMs), particularly through reinforcement learning (RL), have significantly improved their problem-solving capabilities. Research suggests that when trained with RL fine-tuning, LRMs acquire skills similar to human "slow thinking," developing sophisticated strategies for tackling complex tasks.

However, these improvements are mainly seen when models work with relatively short pieces of text, typically around 4,000 tokens. Scaling their reasoning to much longer contexts (e.g., 120,000 tokens) remains a major challenge. Such long-form reasoning requires a robust understanding of the entire context and the ability to perform multi-step analysis. "This limitation poses a significant barrier to practical applications requiring interaction with external knowledge, such as deep research, where LRMs must collect and process information from knowledge-intensive environments," the developers of QwenLong-L1 write in their paper.

The researchers formalize these challenges in the concept of "long-context reasoning RL." Unlike short-context reasoning, which often relies on knowledge already stored within the model, long-context reasoning RL requires models to retrieve and ground relevant information from lengthy inputs. Only then can they generate chains of reasoning based on this incorporated information.
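As a rough illustration of this retrieve-then-reason setup, a training instance can be pictured as a prompt that packages the document and the question together. The template and field names below are illustrative assumptions, not the paper's exact format.

```python
# A minimal sketch of a long-context DocQA training instance. The point is
# that evidence must be grounded in the supplied document rather than
# recalled from the model's parameters.

PROMPT_TEMPLATE = """Read the document below and answer the question.

<document>
{document}
</document>

Question: {question}

Think step by step, citing the relevant passages, then state a final answer.
"""

def build_example(document: str, question: str, answer: str) -> dict:
    """Package one instance for RL rollout and reward scoring."""
    return {
        "prompt": PROMPT_TEMPLATE.format(document=document, question=question),
        "reference_answer": answer,               # ground truth for the reward
        "context_tokens": len(document.split()),  # crude length proxy
    }
```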

Training models for this through RL is difficult and often results in inefficient learning and unstable optimization. Models struggle to converge on good solutions or lose their ability to explore diverse reasoning paths.

QwenLong-L1: A multi-stage approach

QwenLong-L1 is a reinforcement learning framework designed to help LRMs transition from proficiency with short texts to robust generalization across long contexts. It enhances existing short-context LRMs through a carefully structured, multi-stage process (a simplified sketch of the full pipeline follows the list):

Warm-up supervised fine-tuning (SFT): The model first undergoes an SFT phase in which it is trained on examples of long-context reasoning. This phase establishes a solid foundation, enabling the model to ground information accurately from long inputs, and helps develop fundamental capabilities in understanding context, generating logical reasoning chains, and extracting answers.

Curriculum-guided phased RL: The model is then trained through several stages in which the target length of the input documents gradually increases. This systematic, step-by-step approach helps the model steadily adapt its reasoning strategies to longer contexts and avoids the instability that often arises when a model is abruptly trained on very long texts.

Difficulty-aware retrospective sampling: The final training stage incorporates challenging examples from the preceding phases, ensuring the model keeps learning from the hardest problems. Prioritizing hard examples encourages the model to explore more diverse and complex reasoning paths.
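Here is a minimal, runnable sketch of how these three stages might fit together, reusing the example records from the earlier sketch. The stage lengths, the 0.5 reward threshold, and the stub training functions are illustrative assumptions; the released QwenLong-L1 code is the authoritative implementation.

```python
import random

def sft_train(model, examples):
    """Stand-in for the warm-up supervised fine-tuning phase."""
    return model

def rl_train_step(model, example):
    """Stand-in for one RL update; returns the episode reward."""
    return random.random()

def train_pipeline(model, sft_data, rl_data,
                   stage_lengths=(20_000, 60_000, 120_000)):
    # Stage 1: warm-up SFT on long-context examples to ground basic skills.
    model = sft_train(model, sft_data)

    hard_pool = []  # difficult examples carried over between RL phases
    for max_len in stage_lengths:  # Stage 2: curriculum-guided phased RL
        # Train only on inputs up to the current target length, plus hard
        # examples retained from earlier phases.
        batch = [ex for ex in rl_data if ex["context_tokens"] <= max_len]
        batch.extend(hard_pool)
        random.shuffle(batch)

        for example in batch:
            reward = rl_train_step(model, example)
            # Stage 3: difficulty-aware retrospective sampling keeps
            # low-reward (hard) examples so later phases revisit them.
            if reward < 0.5 and example not in hard_pool:
                hard_pool.append(example)

    return model
```

The design choice mirrored here is that input length, not just task difficulty, drives the curriculum, while the hard-example pool keeps earlier failures in circulation.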

[Figure: Overview of the QwenLong-L1 training process. Source: arXiv]

Beyond this structured training, QwenLong-L1 uses a distinct reward system. Whereas training for short-context reasoning tasks often relies on strict rule-based rewards (e.g., a correct answer to a math problem), QwenLong-L1 employs a hybrid reward mechanism. It combines rule-based verification, which ensures precision by checking for strict adherence to correctness criteria, with an "LLM-as-a-judge." The judge model compares the semantics of a generated answer against the ground truth, allowing more flexibility and better handling of the diverse ways correct answers can be expressed when dealing with long, nuanced documents.
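In code, such a hybrid reward might look like the following sketch. Taking the maximum of the two signals is our reading of the paper's description; judge_semantic_match stands in for a real call to a judge model and is replaced here by a crude token-overlap heuristic.

```python
import re

def rule_based_reward(predicted: str, reference: str) -> float:
    """1.0 only if the normalized answers match exactly, else 0.0."""
    norm = lambda s: re.sub(r"\s+", " ", s.strip().lower())
    return 1.0 if norm(predicted) == norm(reference) else 0.0

def judge_semantic_match(predicted: str, reference: str) -> float:
    """Stand-in for an LLM judge scoring semantic equivalence in [0, 1].
    Here: a crude token-overlap (Jaccard) heuristic for illustration."""
    p, r = set(predicted.lower().split()), set(reference.lower().split())
    return len(p & r) / len(p | r) if p | r else 0.0

def hybrid_reward(predicted: str, reference: str) -> float:
    # The rule-based check guarantees precision; the judge gives credit to
    # correct answers phrased differently than the reference.
    return max(rule_based_reward(predicted, reference),
               judge_semantic_match(predicted, reference))
```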

Putting QwenLong-L1 to the test

The Alibaba team evaluated QwenLong-L1 using document question answering (DocQA) as the primary task. This scenario is highly relevant to enterprise needs, where AI must understand dense documents to answer complex questions.
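For concreteness, a DocQA evaluation reduces to scoring model answers against references over a set of (document, question) pairs. The sketch below assumes a hypothetical ask_model helper and simple exact-match scoring, not the paper's actual metric.

```python
def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for querying the model under evaluation."""
    raise NotImplementedError

def docqa_accuracy(benchmark: list[dict]) -> float:
    """Fraction of benchmark questions answered correctly (exact match)."""
    correct = sum(
        ask_model(ex["prompt"]).strip().lower()
        == ex["reference_answer"].strip().lower()
        for ex in benchmark
    )
    return correct / len(benchmark)
```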

Experimental results across seven long-context DocQA benchmarks showed QwenLong-L1's capabilities. Notably, the QwenLong-L1-32B model (based on DeepSeek-R1-Distill-Qwen-32B) achieved performance comparable to Anthropic's Claude-3.7 Sonnet Thinking and outperformed models such as OpenAI's o3-mini and Qwen3-235B-A22B. The smaller QwenLong-L1-14B model also outperformed Google's Gemini 2.0 Flash Thinking and Qwen3-32B.

[Figure: Benchmark results for the QwenLong-L1 models. Source: arXiv]

An important finding relevant to real-world applications is how RL training causes specialized long-context reasoning behaviors to emerge. The paper notes that models trained with QwenLong-L1 become better at "grounding" (linking answers to specific parts of a document), "subgoal setting" (breaking complex questions into smaller parts), "backtracking" (recognizing and correcting their own mistakes mid-reasoning), and "verification" (double-checking their answers).

For instance, while a base model might get sidetracked by irrelevant details in a financial document or stuck in a loop of over-analyzing unrelated information, the RL-trained QwenLong-L1 model demonstrated effective self-reflection: it could filter out distractor details, backtrack from wrong paths, and arrive at the correct answer.

Techniques like QwenLong-L1 could significantly expand the utility of AI in the enterprise. Potential applications include legal tech (analyzing thousands of pages of legal documents), finance (deep research on annual reports and financial filings for risk assessment or investment opportunities), and customer service (analyzing long customer-interaction histories to provide more informed support). The researchers have released the code for the QwenLong-L1 recipe and the weights for the trained models.
