OpenAI's new voice cloning tool raises ethical concerns

01/04/2024

Arab Times

Josephine

Ethical dilemma: OpenAI's Voice Engine raises red flags.

NEW YORK, April 1: OpenAI, a leading artificial intelligence research lab, has developed a groundbreaking tool named Voice Engine capable of replicating anyone's voice with startling accuracy using only 15 seconds of recorded audio. However, amidst concerns over potential misuse and misinformation, the AI lab has opted against a general release of the technology.

First developed in 2022, Voice Engine is used for the text-to-speech feature in ChatGPT, OpenAI's flagship AI tool, but its capabilities have largely remained undisclosed to the public. The limited disclosure reflects the cautious approach OpenAI is taking to prevent the proliferation of damaging misinformation, particularly in light of global elections.

In an unsigned blog post, OpenAI expressed a desire to initiate discussions regarding the responsible implementation of synthetic voices and how society can adapt to this transformative technology. The lab aims to gather insights from small-scale tests and dialogues before making a definitive decision on the widespread deployment of Voice Engine.

Despite not releasing the tool broadly, OpenAI shared examples of its real-world applications. Age of Learning, an education technology firm, employs Voice Engine to generate scripted voiceovers, while HeyGen, an "AI visual storytelling" app, enables users to create translations of recorded content while retaining the original speaker's accent and voice.

Moreover, researchers at the Norman Prince Neurosciences Institute successfully utilized Voice Engine to recreate the voice of a young woman who had lost her speech due to a brain tumor, underscoring the tool's potential for positive applications.

OpenAI emphasized the importance of societal resilience against the challenges posed by increasingly convincing generative models. As part of this effort, the lab advocated for phasing out voice-based authentication for sensitive information access and proposed policies to safeguard individuals' voices in AI applications.

Furthermore, OpenAI disclosed that audio generated by Voice Engine is watermarked for traceability, and that its partnerships with developers mandate explicit consent from the original speakers. However, while OpenAI's tool stands out for its technical simplicity, competitors such as ElevenLabs offer similar capabilities to the public, albeit requiring longer audio samples and with safeguards against misuse.

As the debate over AI ethics and regulation intensifies, OpenAI's deliberative approach underscores the necessity of balancing technological innovation with ethical considerations and societal safeguards.