    Small Language Models: Edge AI Innovation from AI21

    By PineapplesUpdate, October 8, 2025

    While much of the AI world is racing to build ever-larger language models like OpenAI’s GPT-5 and Anthropic’s Claude Sonnet 4.5, Israeli AI startup AI21 is taking a different path.

    AI21 has just unveiled Jamba Reasoning 3B, a 3-billion-parameter model. This compact, open-source model can handle a huge context window of 250,000 tokens (meaning it can “remember” and reason about much more text than typical language models) and can run at high speed even on consumer devices. The launch highlights a growing shift: smaller, more efficient models could shape the future of AI just as much as raw scale.

    “We believe in a more decentralized future for AI – where not everything runs in massive data centers,” says Ori Goshen, co-CEO of AI21, in an interview with IEEE Spectrum. “Large models will still play a role, but smaller, powerful models running on devices will have a significant impact on both the future and the economics of AI,” he says. Jamba is built for developers who want to create edge-AI applications and specialized systems that run efficiently on devices.

    AI21’s Jamba Reasoning 3B is designed to handle long sequences of text and challenging tasks like math, coding and logical reasoning – all while running with impressive speed on everyday devices like laptops and mobile phones. Jamba Reasoning 3B can also work in hybrid setups: simple tasks are handled locally by the device, while heavier problems are sent to powerful cloud servers. According to AI21, this smart routing could dramatically cut the cost of AI infrastructure for some workloads – possibly by orders of magnitude.
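
The hybrid setup described above can be pictured as a small routing function. The sketch below is only an illustration of the idea, not AI21's implementation: the `Task` fields, the `needs_heavy_reasoning` flag (assumed to come from some upstream classifier), and the token threshold are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cutoff: prompts longer than the on-device context window
# cannot be handled locally at all, so they go to the cloud.
LOCAL_MAX_PROMPT_TOKENS = 250_000


@dataclass
class Task:
    prompt_tokens: int
    needs_heavy_reasoning: bool  # assumed to be set by an upstream classifier


def route(task: Task) -> str:
    """Return "local" for simple tasks and "cloud" for heavier ones."""
    if task.prompt_tokens > LOCAL_MAX_PROMPT_TOKENS:
        return "cloud"
    if task.needs_heavy_reasoning:
        return "cloud"
    return "local"


def cloud_share(tasks: list[Task]) -> float:
    """Fraction of tasks that end up on cloud servers (0.0 for an empty list)."""
    if not tasks:
        return 0.0
    return sum(route(t) == "cloud" for t in tasks) / len(tasks)
```

The smaller the value returned by `cloud_share` for a given workload, the less traffic reaches data-center hardware, which is where the claimed cost savings would come from.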

    A small but mighty LLM

    With 3 billion parameters, Jamba Reasoning 3B is small by today’s AI standards. Goshen noted that models like GPT-5 or Claude exceed 100 billion parameters, and even small models like Llama 3 (8B) or Mistral (7B) are more than twice the size of AI21’s model.

    That compact size makes it all the more remarkable that AI21’s model can handle a context window of 250,000 tokens on consumer devices. Some proprietary models, such as GPT-5, provide even longer context windows, but Jamba sets a new high-water mark among open-source models. The previous open-model record of 128,000 tokens was held by Meta’s Llama 3.2 (3B), Microsoft’s Phi-4 Mini, and DeepSeek R1, which are all much larger models. Jamba Reasoning 3B can process more than 17 tokens per second even when operating at full capacity, that is, with extremely long inputs that use its full 250,000-token context window. Many other models slow down or struggle when their input length exceeds 100,000 tokens.

    Goshen explains that the model is built on an architecture called Jamba, which combines two types of neural network designs: Transformer layers, familiar from other large language models, and Mamba layers, which are designed to be more memory-efficient. This hybrid design enables the model to handle long documents, large codebases, and other extensive inputs directly on a laptop or phone using about one-tenth the memory of a traditional Transformer. Goshen says the model runs much faster than traditional Transformers because it relies less on a memory component called the KV cache, which can slow down processing as inputs grow long.
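
To see why relying less on the KV cache matters at long context lengths, a back-of-the-envelope estimate helps. In a standard Transformer, every attention layer stores one key and one value vector per token, so the cache grows linearly with input length. The layer counts, head counts, and dimensions below are illustrative assumptions, not Jamba Reasoning 3B's published configuration, and the sketch ignores the fixed-size state that Mamba layers keep.

```python
def kv_cache_bytes(attention_layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Estimate KV-cache size; the leading 2 counts keys plus values."""
    return 2 * attention_layers * kv_heads * head_dim * seq_len * bytes_per_value


SEQ_LEN = 250_000  # the context length quoted in the article

# Hypothetical pure-Transformer model: attention in all 28 layers, 16-bit values.
full = kv_cache_bytes(attention_layers=28, kv_heads=8, head_dim=128, seq_len=SEQ_LEN)

# Hypothetical hybrid model: attention in only 3 of the 28 layers.
hybrid = kv_cache_bytes(attention_layers=3, kv_heads=8, head_dim=128, seq_len=SEQ_LEN)

print(f"full attention: {full / 1e9:.1f} GB")
print(f"hybrid:         {hybrid / 1e9:.1f} GB")
print(f"ratio:          {hybrid / full:.2f}")
```

Under these assumed numbers the cache shrinks in proportion to the number of attention layers, which is consistent in spirit with the roughly one-tenth memory figure Goshen cites, though the real ratio depends on the actual architecture.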

    Why the need for a small LLM?

    The hybrid architecture of the model gives it an advantage in both speed and memory efficiency, even with very long inputs, confirms a software engineer working in the LLM industry. The engineer requested to remain anonymous because he is not authorized to comment on other companies’ models. As more users run generative AI locally on laptops, models need to quickly handle long context lengths without consuming too much memory. The engineer says that at 3 billion parameters, Jamba meets these requirements, making it a model that is optimized for on-device use.

    Jamba Reasoning 3B is open source under the permissive Apache 2.0 license and is available on popular platforms like Hugging Face and LM Studio. The release also comes with instructions on how to fine-tune the model through an open-source reinforcement-learning platform (called VERL), making it easier and more affordable for developers to adapt the model to their own tasks.

    “Jamba Reasoning 3B marks the beginning of a family of small, efficient reasoning models,” Goshen said. “Scaling down enables decentralization, personalization, and cost efficiency. Instead of relying on expensive GPUs in data centers, individuals and enterprises can run their own models on devices. This opens up new economies of scale and broader reach.”
