AI companies have found many applications for the technology, such as image and text generation. OpenAI is close to releasing a tool that can clone a person's voice as well, but the potential for misuse has made the company reluctant to release it widely.
Potential Misuse
Voice generation tools that turn text into speech in one click are already readily available, as are certain models that recreate a specific person's voice. Given that OpenAI has already landed in hot water because of its AI tools, the company is expected to be more cautious.
OpenAI just announced its voice-cloning tech called Voice Engine, which is capable of mimicking someone's voice with just a 15-second sample. That voice can then be used to generate audio through a text-to-speech process.
It is easy to see how the tool could be used for the wrong reasons. Anyone can obtain snippets of someone's recordings and make them "say" things they never would. For that reason, the AI company is slowing down its release.
"In line with our approach to AI safety and our voluntary commitments, we are choosing to preview but not widely release this technology at this time," the company stated. OpenAI mentioned its hope that the preview of the tool underscores its potential.
Furthermore, the AI giant said the preview is intended to motivate the "need to bolster societal resilience against the challenges brought by ever more convincing generative models," as reported by Ars Technica.
The tool's potential for misuse is particularly worrying, especially since well-known individuals and powerful figures have countless videos online of them speaking, which can easily be used as a voice sample for cloning.
Putting It to Good Use
Despite the potential for misuse, the tool also has many applications that could benefit people. OpenAI listed several uses for Voice Engine, all of which are considered early applications.
For one, it can be used to provide reading assistance to non-readers and children, offering a more natural-sounding voice. GPT-4 can even be used to create personalized responses to students in real-time.
It could also be used for translating content. A lot of AI models already offer live translation, such as Samsung's Galaxy AI, which is being applied to calls and texts. Users of Voice Engine can translate statements into different languages in their own voice.
Voice Engine can also serve as a substitute for one's actual voice, especially for those who are non-verbal or have degenerative speech conditions. Such AI tools are already being explored in certain clinical contexts.
There are still many kinks to work out before the tool can be used safely. For instance, there is the matter of voice authentication as a security measure, which convincing voice clones could undermine. Policies also need to be drafted and implemented to make sure that a person's voice is protected from being used without their consent.