Why robocalls are about to get more dangerous

The problem of unsolicited robocalls has gotten so bad that many people now refuse to pick up calls from numbers they don’t know. It’s become a defense of last resort in an increasingly frustrating situation that’s led to nearly 25 million Americans becoming victims of fraud. If only it were that simple to solve.

By next year, it’s estimated that half of the calls we receive will be scams, but even more worrisome, 90% of those calls will be “spoofed” — falsely appearing as if they’re coming from a familiar number in your contact book.

The government is finally waking up to the severity of the issue and is funding and developing a suite of tools, apps and approaches intended to prevent the scammers from getting through.

Unfortunately, it’s too little too late. By the time these “solutions” become widely available, scammers will have moved on to radically more sophisticated tactics. In the near future, it’s not just going to be the number you see on your screen that will be in doubt. You will soon also question whether the voice you’re hearing is actually real.

That’s because there is a series of powerful voice manipulation, impersonation and automation technologies that are about to become widely available for anyone to use. Gone are the robotic-sounding voice changers of yesterday. With machine learning, software can now understand and mimic the intonations, speaking style and emotions we use in daily conversation.

And we may already be past the point where we are able to tell whether there’s a human being or a bot on the other end of the phone.

At this year’s Google I/O conference, the company demonstrated a new voice technology that sounded so convincingly human that it was able to speak to a receptionist and book a reservation without detection. Then we saw BuzzFeed reporter Charlie Warzel use a free program called Lyrebird to create an “avatar” of his voice, simply by reading phrases into the program for an hour. The result was good enough to fool his own mother.

As these systems collect more data and evolve, they require fewer and shorter audio clips in order to make believable replicas.

Take Chinese tech giant Baidu’s progress in developing its text-to-speech technology named DeepVoice, for example. When the first version was released in early 2017, it was capable of assembling short sentences that sounded quite realistic, but it required hours of recordings and could only process a single voice. Two releases later, the software is now capable of processing thousands of different voices and requires only 30 minutes of training data.

These developments threaten to make our current frustrations with robocalls much worse. The reason that robocalls are a thorny issue has less to do with volume than