
ZDNET’s key takeaways
- Many AI tools that can generate human speech are now available.
- Some AI voices can now whisper, laugh, and do other tricks.
- TTS tools vary in their levels of realism and in their intended audiences.
Synthetic voices generated by artificial intelligence are becoming the norm, for better or worse. Meanwhile, the number of companies developing this technology is growing rapidly.
Recent innovations such as the transformer architecture (the backbone of many generative AI tools, including large language models), generative adversarial networks (GANs), and diffusion models have led to the rise of AI systems that can convert text prompts into natural-sounding synthetic speech. A wide variety of these text-to-speech (TTS) systems is now available, each with its own particular benefits and shortcomings.
To get a clearer understanding of which is the most advanced, I tested three of the most popular free TTS tools currently on the market.
ElevenLabs
https://www.youtube.com/watch?v=otz1ffg5-3W
ElevenLabs is widely considered an industry leader in voice realism, and I found that to be a fair and accurate assessment in my own experiments with the company’s TTS tool. But that realism feels more closely aligned with the voice of a trained voice actor or professional podcaster than with ordinary human conversation; it’s almost a little bit too polished. In that sense, however, it is a go-to option for many businesses and professionals in search of reliable automated narration. It also supports more than 20 languages, which broadens the platform’s reach and appeal.
The company also released a new text-to-speech model called v3 as a research preview last month. It supports more than 70 languages, and users can spice up their AI-generated dialogue with audio tags that can cause it to laugh, sigh, or whisper, just to name a few examples.
You can sign up for a free ElevenLabs account, and you will automatically get 10,000 free credits. Select the “Text to Speech” option under “Playground” in the left-hand menu, and you will be directed to a page where you can enter a custom prompt that you want the AI system to narrate, choose from a range of custom voices, and adjust parameters such as speed and stability. Prompts are limited to 5,000 characters, and each character uses up one credit in each iteration of voice generation.
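To make the credit arithmetic concrete, here is a minimal sketch that uses only the figures quoted above: 10,000 free credits, a 5,000-character prompt limit, and one credit per character per generation. The function names are hypothetical, and the sketch assumes that Python’s len() counts characters the same way the platform does, which may not hold for every script or symbol.

```python
# Figures taken from the article; the helper names are hypothetical.
FREE_CREDITS = 10_000      # credits granted to a new free account
MAX_PROMPT_CHARS = 5_000   # per-prompt character limit


def credits_for(prompt: str, generations: int = 1) -> int:
    """Return the credits used, at one credit per character per generation."""
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if generations < 0:
        raise ValueError("generations must not be negative")
    return len(prompt) * generations


def generations_available(prompt: str, balance: int = FREE_CREDITS) -> int:
    """Return how many full generations of this prompt the balance covers."""
    if not prompt:
        raise ValueError("prompt is empty")
    return balance // credits_for(prompt)
```

Under these assumptions, a prompt at the full 5,000-character limit could be generated twice before the free balance runs out, while a 500-character prompt could be regenerated 20 times.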
Hume AI
https://www.youtube.com/watch?v=clhsd8fucq8
Hume AI’s TTS model is another contender for the most realistic voice-generating tool. The company has positioned its proprietary Empathic Voice Interface (EVI) as an AI system that can capture and imitate the subtleties of human speech, imbuing it with a deeper layer of authenticity. Like ElevenLabs, Hume offers a comprehensive set of premade AI voice characters, each with its own expressive quirks. You can also generate custom voices by describing them in natural-language prompts.
To test it, I did my best to describe the voice of Samwise Gamgee from “The Lord of the Rings” as portrayed in the films by Sean Astin. My prompt: “Gentle but brave hobbit, working-class, West Country British accent with a hint of Welsh. He should sound frightened but determined to fulfill his mission.”
When I prompted it to say a famous line from the film, “If I take one more step, it’ll be the farthest away from home I’ve ever been,” it produced three samples, which varied in tone and volume. They were all impressive; to my ear, they had a degree of realism and emotional depth that is not replicated by Hume’s rivals. They did not sound like Astin’s Sam, but that was undoubtedly a reflection of the imperfect description I used as a prompt.
You can also pepper in pauses by adding “(pause)” to your prompt, or add slang such as “y’all” to increase the authenticity of your custom voices.
Descript
If you are looking for an AI voice-generating tool that offers a range of editing features, Descript is the one to choose.
The company’s TTS model generates audio files in WAV format, which you can edit much as you would in Adobe Audition or a similar platform. You can choose from a library of premade AI voices or submit a short recording of your own voice, and the system will clone it for you.
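Because the output is an ordinary WAV file, it can be inspected with standard tooling before it goes into an editor. The sketch below is an illustration using Python’s built-in wave module, not anything specific to Descript; the function names are hypothetical, and write_silence exists only to produce a stand-in file for trying the reader out.

```python
import wave


def wav_summary(path: str) -> dict:
    """Read basic properties of an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / rate,
        }


def write_silence(path: str, seconds: float = 1.0, rate: int = 16000) -> None:
    """Write a mono, 16-bit WAV file of silence to use as a stand-in clip."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
```

Calling wav_summary on an exported clip gives a quick check of its sample rate and duration. The wave module handles only uncompressed PCM audio, so a file in another encoding would need a different reader.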
I tested the voice-cloning feature by asking the system to read a short prompt: “The summer is getting brutal in New York City, and I need to invest in a higher-quality air conditioner.” (Which is true.) The first time, the AI-rendered version of my voice certainly sounded like me, but there was also a mechanical quality that detracted from the realism.
I decided to give it another try and re-record my voice, this time taking off my Bluetooth headphones and reading the script slowly and deliberately. This time the results were much more realistic: a more solid simulation of my voice, in my opinion, than a similar voice-cloning feature offered by Hume.
You can also adjust each piece of AI-generated audio by editing your written prompt directly. The clone was not perfect, of course; my close friends and family members would probably be able to spot the difference, but it would probably fool my more distant acquaintances. I can easily imagine using the tool to narrate my own articles or for some similar use case.
For podcasters and other content creators looking to polish their audio recordings quickly, Descript also provides an AI feature that identifies filler words, unnecessary pauses, “umms” and “uhhs,” and other unwanted bits of audio.
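To give a rough sense of what flagging filler words involves, here is a minimal sketch that scans a text transcript for “um” and “uh” variants with a regular expression. It is an illustration only and makes no claim about how Descript’s feature works; the pattern and the function name are assumptions, and a real tool would also need to handle the audio itself and a much wider range of fillers.

```python
import re

# Matches standalone "um"/"uh" variants such as "umm" or "uhh", in any case.
# Word boundaries keep words like "umbrella" or "hum" from matching.
FILLER_PATTERN = re.compile(r"\b(?:u+m+|u+h+)\b", re.IGNORECASE)


def find_fillers(transcript: str) -> list[tuple[int, str]]:
    """Return (character offset, matched text) for each filler found."""
    return [(m.start(), m.group()) for m in FILLER_PATTERN.finditer(transcript)]
```

The character offsets would let an editor highlight each hit in the transcript for review, which is safer than deleting matches automatically.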
ZDNET’s advice
It is important to note that these are only three of a vast number of currently available TTS models, and that each user will have their own priorities based on their professional role, tech savviness, budget, and so on. Before you choose a platform and run with it, spend a few minutes playing with different options to see which user interface feels most comfortable and which offers the features that most closely align with your creative goals. Also remember that services differ in how they use your data.
Regardless of which platform you use, keep an eye on the speed at which this technology is developing. Very soon, we may be living in a world full of AI voices, and some of them may sound just like your own.

