OpenAI, the AI company backed by Microsoft, has announced its latest venture into artificial intelligence (AI): the “Voice Engine” platform. The new model can recreate human voices, marking a significant step forward in speech technology.
Following the success of ChatGPT, DALL-E, and Sora, OpenAI has expanded its focus to include human speech and voice. Voice Engine, currently in a small-scale preview, can recreate a speaker’s voice in multiple languages from a single 15-second recording. However, the platform is not yet available to the public.
We're sharing our learnings from a small-scale preview of Voice Engine, a model which uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker. https://t.co/yLsfGaVtrZ
— OpenAI (@OpenAI) March 29, 2024
The New York Times reportedly received a demo of the Voice Engine platform. The AI engine can recreate a human’s voice from a 15-second recording and, once it has absorbed the necessary information, can speak any text prompt in the recreated voice.
#OpenAI new voice engine helps patient who lost speech due to brain tumor speak fluently again. @OpenskiesX pic.twitter.com/YB0mjKJYew
— RameshR (@rezmeram) March 29, 2024
Interestingly, the text prompt need not be in the native language of the speaker whose voice was used to train the AI engine. This means that a native English speaker could potentially speak in Spanish, French, Chinese, or many other languages through Voice Engine.
Given the potential pitfalls of such technology, OpenAI says it is exploring multiple safety measures, such as watermarking and controls that prevent Voice Engine from recreating the voices of certain individuals. OpenAI product manager Jeff Harris has reportedly said that the company has no immediate plans to monetize the technology. Instead, the primary purpose is to help people who have lost their voices due to illness or accident.
Currently, Voice Engine is available only to a small group of businesses, presumably by invitation. This restriction is understandable considering the significant ethical and legal implications of an AI platform that can recreate human voices in multiple languages from a 15-second recording.
OpenAI publicly announces their Voice Engine, which allows voice cloning from 15 seconds of audio. https://t.co/zMRViqN5f5
Originally developed in late 2022, they have tested it with a variety of trusted partners. Some demo samples are shared in the blog post. They have no… pic.twitter.com/aGK0ghwlsv
— Tanishq Mathew Abraham, Ph.D. (@iScienceLuvr) March 29, 2024
AI has evolved into a behemoth, with the ability to create convincing deepfake images, videos, and now, voices. In the wrong hands, Voice Engine could be used to clone the voices of politicians, celebrities, journalists, and other prominent personalities. This could lead to convincing audio clips that spread misinformation or propaganda. In a more alarming scenario, hackers and criminals could use cloned voices to compromise security systems that rely on voice authentication.
Despite the safety and ethical concerns, OpenAI’s Voice Engine could be tremendously helpful in various fields. Film and web-series producers who need to dub their creations into other languages could greatly benefit from this technology. Similarly, the educational and entertainment fields could also see immense benefits.
OpenAI just launched Voice Engine,
It uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker.
Reference and Generated audio is very close and hard to differentiate.
More details in 🧵 pic.twitter.com/tJRrCO2WZP
— AshutoshShrivastava (@ai_for_success) March 29, 2024
In conclusion, while the Voice Engine platform holds great promise, it also underscores the need for stringent safety checks and regulations to prevent misuse. As AI continues to evolve, it’s crucial that we navigate its development responsibly, ensuring that its benefits are harnessed while mitigating potential risks.