Voice cloning is the use of computer software to generate a synthetic, adaptable copy of a person’s voice.
When Tim Heller first heard his cloned voice, he says it was so accurate that “my jaw hit the floor… it was mind-blowing”. From a recording of someone talking, the software can replicate his or her voice speaking whatever words or sentences are typed into a keyboard.
Such have been the recent advances in the technology that the computer-generated audio is now said to be unnervingly exact. The software picks up not just your accent, but your timbre, pitch, pace, flow of speech and even your breathing.
Mr Heller, a 29-year-old voiceover artist and actor from Texas, does everything from portraying cartoon characters and narrating audiobooks and documentaries, to voicing video game characters and film trailers.
He says he recently turned to voice cloning to “future proof” his career. He says it may enable him to secure more work. For example, if he is ever double-booked, he could offer to send his voice clone to do one of the jobs instead.
“If I am booked for other work… I can position my ‘dub’ [his name for his voice clone] as an option that can save clients time, and generate passive income for myself,” says Mr Heller. To get his voice cloned, Mr Heller went to a Boston-based business called VocaliD – one of a growing number of companies now offering the service.
The firm’s founder, Prof Patel, set up the business in 2014 as an extension of her clinical work creating artificial voices for patients who are unable to talk without assistance, such as people who have lost their voice following surgery or illness.
She says that the technology – which is led by artificial intelligence, software that can “learn” and adapt by itself – has advanced greatly over the past few years. This has caught the attention of voiceover artists.
Voice cloning can also be used to translate an actor’s words into different languages, thereby potentially meaning, for example, that US film production companies will no longer need to hire additional actors to make dubbed versions of their movies for overseas distribution.
Yet while the increasing sophistication of voice cloning has obvious commercial potential, it has also led to growing concerns that it could be used in cybercrime – to trick people into believing that someone else is talking.
Together with computer-generated fake videos, cloned voices are also known as “deepfakes”. And cyber-security expert Eddy Bobritsky says the synthetic voices come with a “huge security risk”.
People have long trusted a familiar voice on the phone, but Mr Bobritsky says that is now changing. “For example, if a boss phones an employee asking for sensitive information, and the employee recognises the voice, the immediate response is to do as asked. It’s a path for a lot of cybercrimes.”
In fact, such a case was reported by the Wall Street Journal back in 2019, with a UK manager said to have been tricked into transferring €220,000 ($260,000; £190,000) to fraudsters who used a cloned copy of the voice of his German boss.
“Steps need to be taken to deal with this new technology and the threats it brings with it,” adds Mr Bobritsky. Firms around the world are in fact already doing this, as specialist artificial intelligence news website Venture Beat has reported.
Such companies can monitor audio to detect whether it is fake, looking for tell-tale signs such as repetition, digital noise, and the use of certain phrases or words.