New York-based AI startup Hume has unveiled EVI 3 (pronounced "Eevee" three, like the Pokémon character), the latest version of its Empathic Voice Interface (EVI) conversational AI model, targeting everything from customer support systems and health coaching to immersive storytelling and virtual companionship.
EVI 3 allows users to create their own custom voices by talking with the model (it is voice-to-voice/speech-to-speech), and it aims to set a new standard for naturalness, expressiveness, and "empathy," which, according to Hume, means the model's ability to pick up on a user's emotions and adjust its own responses in kind.
Designed for businesses, developers, and creators, EVI 3 builds on Hume's previous voice models with more sophisticated customization, faster responses, and improved emotional understanding.
Individual users can interact with it today through Hume's live demo on its website and iOS app, while access through Hume's proprietary application programming interface (API) is slated for the "coming weeks," according to a blog post from the company.
At that point, developers will be able to embed EVI 3 in their own customer service systems, creative projects, or virtual assistants, for a price (see below).
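Once that API opens up, integration will presumably look like a streaming speech-to-speech loop. Here is a minimal sketch of what that could look like in Python; the endpoint, auth style, and message types below are illustrative assumptions, not Hume's published EVI 3 spec, which had not shipped at the time of writing.

```python
# Minimal sketch of embedding a speech-to-speech model over a WebSocket.
# Endpoint URL, auth style, and message schema are assumptions for
# illustration; consult Hume's docs once the EVI 3 API actually lands.
import asyncio
import base64
import json

import websockets  # pip install websockets

EVI_URL = "wss://api.hume.ai/v0/evi/chat?api_key={key}"  # assumed endpoint

async def talk(api_key: str, wav_bytes: bytes) -> None:
    async with websockets.connect(EVI_URL.format(key=api_key)) as ws:
        # Send one chunk of user audio (speech-to-speech input).
        await ws.send(json.dumps({
            "type": "audio_input",  # assumed message type
            "data": base64.b64encode(wav_bytes).decode("ascii"),
        }))
        # Consume reply events until the model finishes speaking.
        async for raw in ws:
            event = json.loads(raw)
            if event.get("type") == "audio_output":
                audio = base64.b64decode(event["data"])
                ...  # hand `audio` to your playback layer
            elif event.get("type") == "assistant_end":
                break

# asyncio.run(talk("YOUR_API_KEY", open("hello.wav", "rb").read()))
```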
My own use of the demo allowed me to create a new, custom synthetic voice in seconds based on qualities I described aloud: a mix of warmth and confidence, with a masculine tone. Talking to it felt more natural and easygoing than other AI models, and certainly more so than the stock voices of legacy tech leaders, such as Apple's Siri and Amazon's Alexa.
What developers and businesses should know about EVI 3
Hume's EVI 3 is designed for a range of uses, from content creation for audiobooks and gaming to in-app voice interactions.
It allows users to specify precise personality traits, vocal characteristics, emotional tones, and conversation topics.
That means it can produce anything from a warm, empathetic guide to a quirky, mischievous storyteller, such as "a squeaky mouse with a French accent plotting to steal cheese from the kitchen."
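If the API exposes this the way the live demo does, a persona like that might be captured as a short structured description. The sketch below is purely illustrative; the field names are hypothetical, not Hume's published schema.

```python
# Hypothetical persona specification for a custom EVI 3 voice.
# Field names are illustrative assumptions, not Hume's actual schema.
mouse_persona = {
    "description": (
        "A squeaky mouse with a French accent, plotting to steal "
        "cheese from the kitchen"
    ),
    "personality_traits": ["mischievous", "dramatic"],
    "vocal_characteristics": {"pitch": "high", "pace": "quick"},
    "emotional_tone": "playful",
}
```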
The main strength of EVI 3 lies in its ability to integrate emotional intelligence directly into voice-based experiences.
Unlike traditional chatbots or voice assistants, which rely heavily on scripted or text-based interactions, EVI 3 attends to how speech naturally sounds, including pitch, prosody, rhythm, and vocal bursts, to create more engaging, human-like conversations.
However, one big feature is currently missing from Hume's model, one offered by open-source and proprietary rivals such as ElevenLabs: voice cloning, or the rapid replication of a user's voice or someone else's, such as a company's CEO.
Nevertheless, Hume has indicated it will add such a capability to its Octave text-to-speech model, which is marked as coming "soon" on Hume's website, and earlier reporting on the company found that it would let users replicate voices from as little as five seconds of audio.
Hume has said it is prioritizing safety measures and ethical considerations before making this feature widely available. For now, cloning is simply not available in EVI 3; Hume's emphasis is on flexible voice customization instead.
Internal benchmarks show users prefer EVI 3 to OpenAI's GPT-4o voice model
According to Hume's own tests with 1,720 users, EVI 3 was preferred over OpenAI's GPT-4o in every category measured: naturalness, expressiveness, empathy, interruption handling, response speed, audio quality, modulation of emotion and style on request, and comprehension of user requests.
Hume says it also generally beats Google's Gemini model family and the new open-source AI models from Sesame, the voice AI startup from Brendan Iribe, the co-founder of Oculus.
Hume also claims low latency (~300 milliseconds), strong multilingual support (English and Spanish, with more languages on the way), and effectively unlimited custom voices, as the company notes on its website.

Major capabilities include (a client-side sketch follows the list):
- Prosody generation and expressive text-to-speech with emotion modulation.
- Interruption handling, to enable dynamic conversational flow.
- In-conversation voice customization, so users can adjust speaking style in real time.
- API-ready architecture (coming soon), so developers can integrate EVI 3 directly into apps and services.
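To make the interruption-handling item concrete, here is a rough client-side sketch of barge-in behavior, reusing the assumed message types from the earlier example; none of this is a published Hume spec.

```python
# Hypothetical barge-in handling: when the user starts speaking,
# stop playing the assistant's audio and send the new input.
# Message types mirror the assumed schema from the earlier sketch.
import json

class EviClient:
    def __init__(self, ws, player):
        self.ws = ws          # an open WebSocket connection to the model
        self.player = player  # anything with .play(data) and .stop()

    async def on_user_speech(self, audio_b64: str) -> None:
        # Cut the model off mid-sentence (the "interruption" behavior),
        # then forward the user's new audio so the model can respond.
        self.player.stop()
        await self.ws.send(json.dumps({
            "type": "audio_input",  # assumed message type
            "data": audio_b64,
        }))

    async def on_server_event(self, raw: str) -> None:
        event = json.loads(raw)
        if event.get("type") == "audio_output":
            self.player.play(event["data"])
```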
Pricing and developer access
Hume provides flexible, usage-based pricing across its EVI, Octave TTS, and Expression Measurement APIs.
While specific API pricing for EVI 3 has not yet been announced (it is marked as TBA), past patterns suggest it will be usage-based, with enterprise discounts available for large deployments.
For reference, EVI 2 is priced at $0.072 per minute, roughly 30% lower than its predecessor, EVI 1 ($0.102/min).
For creators and developers working on text-to-speech projects, Hume's Octave TTS plans range from a free tier (10,000 characters of speech, roughly 10 minutes of audio) to enterprise-level plans. Here's the breakdown (a quick cost sketch follows the list):
- Free: 10,000 characters, unlimited custom voices, $0/month
- Starter: 30,000 characters (~30 min), 20 projects, $3/month
- Creator: 100,000 characters (~100 min), 1,000 projects, usage-based overage ($0.20/1,000 extra characters), $10/month
- Pro: 500,000 characters (~500 min), 3,000 projects, $0.15/1,000 extra, $50/month
- Scale: 2,000,000 characters (~2,000 min), 10,000 projects, $0.13/1,000 extra, $150/month
- Business: 10,000,000 characters (~10,000 min), 20,000 projects, $0.10/1,000 extra, $900/month
- Enterprise: custom pricing and unlimited usage
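To see how the overage math works, here is a quick sketch using the Creator tier's numbers from the list above; the 250,000-character monthly workload is an invented example.

```python
# Estimating a monthly Octave TTS bill on the Creator tier.
# Plan numbers come from the pricing list above; the 250,000-character
# workload is a made-up scenario for illustration.
BASE_FEE = 10.00          # $/month, Creator tier
INCLUDED_CHARS = 100_000  # characters included in the base fee
OVERAGE_RATE = 0.20       # $ per 1,000 characters beyond the allowance

chars_used = 250_000
overage_chars = max(0, chars_used - INCLUDED_CHARS)  # 150,000 characters
overage_cost = overage_chars / 1_000 * OVERAGE_RATE  # $30.00
total = BASE_FEE + overage_cost                      # $40.00
print(f"Estimated monthly bill: ${total:.2f}")
```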
For developers working on real-time voice interaction or emotion analysis, Hume also offers a pay-as-you-go plan with $20 in free credits and no upfront commitment. High-volume enterprise customers can opt for a dedicated enterprise plan featuring dataset licensing, on-premises solutions, custom integrations, and advanced support.
A history of Hume's AI voice models
Founded in 2021 by Alan Cowen, a former researcher at Google DeepMind, Hume aims to ground AI interaction in the nuances of human emotion.
The company trained its models on an extensive dataset drawn from hundreds of thousands of participants worldwide, covering not only speech and text but also vocal bursts and facial expressions.
"Emotional intelligence includes the ability to infer intentions and preferences from behavior. That is what AI interfaces are trying to achieve," Cowen told VentureBeat. Hume's mission is to make AI interfaces more responsive, human, and ultimately more useful, whether that means helping a customer navigate an app or narrating a story with just the right mix of drama and humor.
In 2024, the company launched EVI 2, adding features such as dynamic voice customization and in-conversation style prompts, along with 40% lower latency and 30% lower pricing than EVI 1.
Octave followed in February 2025, a text-to-speech engine for content creators capable of adjusting emotion at the sentence level based on textual cues.
With EVI 3 now available for hands-on exploration and full API access around the corner, Hume is inviting developers and creators to rethink what is possible with voice AI.