    Hugging Face: 5 ways enterprises can cut AI costs without sacrificing performance

    By PineapplesUpdate · August 19, 2025 · 7 min read



    Enterprises tend to accept it as a basic fact: AI models require significant amounts of compute, and the only question is how to get more of it.

    But it doesn’t have to be that way, according to Sasha Luccioni, AI and climate lead at Hugging Face. What if there’s a smarter way to use AI? What if, instead of piling on more (often unnecessary) compute and straining to power it, enterprises focused on improving model performance and accuracy?

    Ultimately, model makers and enterprises are focusing on the wrong issue: they should be computing smarter, not harder, Luccioni says.

    “There are smarter ways of doing things that we’re currently under-exploring, because we’re so blinded by: we need more FLOPS, we need more GPUs, we need more time,” she said.



    Here are five key lessons from Hugging Face that can help enterprises of all sizes use AI more efficiently.

    1. Right-size the model to the task

    Avoid defaulting to giant, general-purpose models for every use case. Task-specific or distilled models can match, or even beat, larger models in accuracy for targeted workloads, at lower cost and with lower energy consumption.

    In fact, Luccioni has found in testing that a task-specific model uses 20 to 30 times less energy than a general-purpose one. “Because it’s a model that can do that one task, as opposed to any task that you throw at it, which is often the case with large language models,” she said.

    Distillation is key here: a full model can initially be trained from scratch and then refined for a specific task. DeepSeek R1, for instance, is “so huge that most organizations can’t afford to use it,” because you need at least eight GPUs, Luccioni noted. By contrast, distilled versions can be 10, 20 or even 30x smaller and run on a single GPU.

    Generally speaking, open-source models help with efficiency, she noted, because they don’t need to be trained from scratch. That’s a change from just a few years ago, when enterprises wasted resources because they couldn’t find the model they needed; nowadays, they can start with a base model and fine-tune and adapt it.
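
    To make the trade-off concrete, here is a minimal sketch of serving a narrow task with a small, distilled model via the Hugging Face transformers library. The model name and the sentiment-classification task are illustrative assumptions, not the specific models Luccioni benchmarked.

    # A minimal sketch: serve one narrow task with a distilled model instead of
    # routing it to a general-purpose LLM. Requires: pip install transformers torch
    from transformers import pipeline

    # DistilBERT fine-tuned for sentiment analysis: small enough to run on a
    # single CPU or a modest GPU.
    classifier = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )

    print(classifier("The quarterly report beat expectations."))
    # Expected output shape: [{'label': 'POSITIVE', 'score': ...}]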

    “It enables incremental, shared innovation, as opposed to siloed approaches where everyone trains their own model on their own data and essentially wastes resources in the process,” Luccioni said.

    It’s becoming clear that companies are quickly getting disillusioned with generative AI, as costs are not yet proportionate to the benefits. Generic use cases, such as writing emails or transcribing meeting notes, are genuinely helpful. But task-specific models still require “a lot of work,” because out-of-the-box models don’t cut it and are also more expensive, Luccioni said.

    That’s where the next layer of added value lies. “A lot of companies do want a specific task done,” Luccioni said. “They don’t want AGI, they want specific intelligence. And that’s the gap that needs to be bridged.”

    2. Make efficiency the default

    Adopt “nudge theory” in system design: set conservative reasoning budgets, limit always-on generative features, and require opt-in for high-cost compute modes.

    In cognitive science, “nudge theory” is a behavioral change-management approach designed to influence human behavior subtly. The “canonical example,” Luccioni said, is adding cutlery to takeout orders: making people opt in if they want plastic utensils, rather than automatically including them with every order, can significantly reduce waste.

    “Just getting people to opt in to something, as opposed to having to opt out of it, is actually a very powerful mechanism for changing people’s behavior,” Luccioni said.

    Unnecessary default behaviors also drive up usage, and therefore cost, because models end up doing more work than the task requires. For example, popular search engines such as Google automatically populate a generative AI summary at the top of results by default. Luccioni also noted that when she recently used OpenAI’s GPT-5, the model automatically kicked into full reasoning mode on “very simple questions.”

    “For me, that should be the exception,” she said. “Like, ‘what is the meaning of life?’ Then sure, I want a gen AI summary. But for ‘what’s the weather in Montreal?’ or ‘what are the opening hours of my local pharmacy?’ I don’t need a generative AI summary, yet it’s the default.”
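
    The same opt-in logic can be written directly into application code. Below is a minimal, hypothetical sketch of “efficiency as the default”: the cheap path is what callers get unless they explicitly request the expensive reasoning mode, and each path is capped by a token budget. The helper functions and budget values are illustrative stand-ins, not any specific vendor API.

    # A minimal sketch of "efficiency by default": cheap mode unless the caller
    # explicitly opts in to the costlier reasoning mode. The call_* helpers are
    # hypothetical stand-ins for whatever inference clients you actually use.

    DEFAULT_MAX_TOKENS = 256      # conservative default budget
    REASONING_MAX_TOKENS = 4096   # only spent when the caller opts in

    def call_small_model(prompt: str, max_tokens: int) -> str:
        # Stand-in for a lightweight, task-specific model endpoint.
        return f"[small model, <= {max_tokens} tokens] {prompt}"

    def call_reasoning_model(prompt: str, max_tokens: int) -> str:
        # Stand-in for a full reasoning-mode endpoint.
        return f"[reasoning model, <= {max_tokens} tokens] {prompt}"

    def answer(prompt: str, use_reasoning: bool = False) -> str:
        if use_reasoning:
            # Opt-in path: the caller explicitly accepted the higher cost.
            return call_reasoning_model(prompt, REASONING_MAX_TOKENS)
        # Default path: small model, tight budget.
        return call_small_model(prompt, DEFAULT_MAX_TOKENS)

    print(answer("What time does my local pharmacy open?"))
    print(answer("Compare three architectures for our data pipeline.", use_reasoning=True))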

    3. Optimize hardware usage

    Use batching, adjust precision, and fine-tune batch sizes for the specific hardware generation to minimize wasted memory and power draw.

    For example, enterprises should ask themselves: does the model need to be on all the time? Will people be pinging it in real time, 100 requests at once? If so, always-on optimization is necessary, Luccioni noted. In many other cases, though, it isn’t; the model can instead be run periodically in batches, and batching can ensure optimal memory usage.

    “It becomes an engineering challenge, but a very specific one, so it’s hard to say ‘just distill all the models’ or ‘change the precision on all the models,’” Luccioni said.

    In her recent research, she found that optimal batch size depends on the hardware, down to the specific type or version. Going from one batch size to that size plus one can increase energy use, because the model suddenly needs more memory.

    “That’s something that people don’t really look at. They’re like, ‘Oh, I’ll just max out the batch size,’ but it really comes down to tweaking all these different things, and suddenly it’s super efficient, but it only works in your specific context,” Luccioni explained.
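
    Because the sweet spot is hardware-specific, the practical move is to measure throughput on your own machine rather than assume bigger batches always win. The sketch below, assuming PyTorch, the transformers library, and an illustrative DistilBERT checkpoint, times a fixed workload at several candidate batch sizes; the model and sizes are assumptions for illustration, not values from the article.

    # A minimal sketch: sweep batch sizes on your own hardware and compare
    # throughput. Requires: pip install transformers torch
    import time
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    name = "distilbert-base-uncased-finetuned-sst-2-english"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name).to(device).eval()

    texts = ["Is this review positive or negative?"] * 256  # fixed workload

    for batch_size in (1, 8, 32, 64):
        start = time.perf_counter()
        with torch.no_grad():
            for i in range(0, len(texts), batch_size):
                batch = tokenizer(
                    texts[i : i + batch_size],
                    return_tensors="pt",
                    padding=True,
                    truncation=True,
                ).to(device)
                model(**batch)
        elapsed = time.perf_counter() - start
        print(f"batch_size={batch_size:3d}  {len(texts) / elapsed:7.1f} examples/s")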

    4. Encourage energy transparency

    It always helps when people have an incentive; to that end, earlier this year Hugging Face launched the AI Energy Score. It’s a novel way to promote energy efficiency, using a five-star rating system, with the most efficient models earning “five-star” status.

    Think of it as an “Energy Star for AI.” It was inspired by the potentially soon-to-be-defunct federal program, which sets energy-efficiency specifications and brands qualifying devices with an Energy Star logo.

    “For a couple of decades, it was a genuinely positive motivation; people wanted that star rating, right?” Luccioni said. “Something similar with the Energy Score would be great.”

    Hugging Face has a leaderboard up now, which it plans to refresh with new models (DeepSeek, GPT-OSS) in September, and to keep updating every six months or sooner as new models become available. The goal is for model builders to regard the rating as a “badge of honor,” Luccioni said.
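
    Enterprises don’t have to wait for a leaderboard entry to get visibility into their own footprint. One possible starting point, sketched below as an assumption rather than anything the article prescribes, is wrapping an inference job in an open-source tracker such as codecarbon; the workload function here is a placeholder.

    # A minimal sketch of in-house energy transparency with the open-source
    # codecarbon tracker. Requires: pip install codecarbon
    from codecarbon import EmissionsTracker

    def run_inference_workload() -> None:
        # Placeholder for your actual batched inference job.
        sum(i * i for i in range(10_000_000))

    tracker = EmissionsTracker(project_name="inference-benchmark")
    tracker.start()
    try:
        run_inference_workload()
    finally:
        emissions_kg = tracker.stop()  # estimated kg of CO2-equivalent

    print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")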

    5. Rethink the “more compute is better” mindset

    Instead of chasing the biggest GPU clusters, start with the question: “What is the smartest way to achieve the result?” For many workloads, smarter architecture and better-curated data outperform brute-force scaling.

    “I think people probably don’t need as many GPUs as they think they do,” Luccioni said. Rather than simply going for the biggest clusters, she urged enterprises to rethink what tasks the GPUs will be doing and why they need them, how they carried out those kinds of tasks before, and what adding extra GPUs will actually get them.

    “It’s kind of this race where we need a bigger cluster,” she said. “Think about what you’re using AI for, what technique you actually need, and what that requires.”

