Replicating a Human Voice in Any Language with Artificial Intelligence in 15 Seconds

By Samantha Johnson

Mar 31, 2024
OpenAI has introduced Voice Engine, a speech synthesis model that can create a synthetic voice from a single 15-second audio sample of a speaker. The generated voice can read text in the language of the sample or in other languages. To evaluate both the beneficial applications and the safeguards the technology requires, OpenAI has granted limited access to a small group of partner companies across different sectors.

Partners that have tested Voice Engine include Age of Learning, HeyGen, Dimagi, Livox, and the Lifespan Health System. These collaborations have explored practical applications of the technology, such as producing pre-scripted speech content and using GPT-4 to generate real-time, personalized responses for students.

Led by Jeff Harris, a member of OpenAI’s product team, development of Voice Engine began in late 2022. The model was trained on a mix of licensed and publicly available data, and it already powers the pre-built voices in OpenAI’s text-to-speech API and the Read Aloud feature of ChatGPT. Access to Voice Engine itself will be limited to around ten developers, reflecting OpenAI’s caution in introducing the technology.

Text-to-audio generation, and AI-based voice cloning in particular, is evolving rapidly, with companies such as Podcastle and ElevenLabs among the most active. This growing interest comes with ethical and security concerns about misuse of the technology, as reflected in the recent US Federal Communications Commission ban on automated calls that use AI-cloned voices without consent.

To address these concerns, OpenAI has imposed strict usage policies on its partners. The policies prohibit impersonating individuals or organizations without consent, require explicit and informed consent from the original speaker, and bar partners from letting individual users create their own voices. In addition, every generated audio clip carries a watermark for traceability, and use of the synthetic voices is closely monitored.

In response to the potential risks, OpenAI suggests several preventative measures: phasing out voice-based authentication for access to bank accounts, adopting policies that protect the use of people’s voices in AI, expanding public education about deepfake technology, and developing systems for tracking the origin of AI-generated content.
