Greg Tarr. PHOTO: BT Young Scientists.

Teenage Coder Gives Facebook A Lesson On Deepfake Detection

Greg Tarr’s work has taken on a life of its own

David Braue

Melbourne, Australia – Mar. 24, 2021

At the ripe old age of 18, Greg Tarr has already learned the art of humility: he wrote only 10,000 lines of code to win one of Ireland’s most prestigious science prizes, he points out, not the 150,000 reported, although that figure might be closer to the truth once you add in all the libraries he incorporated.

Yet whatever the number of lines, it was their contents that helped Tarr — who got his first computer at the age of five and started making his own video games at seven “because that sounded really fun” — beat out hundreds of other contenders to recently be named BT Young Scientist & Technologist 2021.

That award, which he earned for his innovative work in “detecting state-of-the-art deepfakes,” was given in recognition of his efforts to analyze and improve upon the five winning solutions to Facebook’s first Deepfake Detection Challenge last year.

Competing to be the best at spotting deepfake videos, the winners of that challenge reached a maximum accuracy of only 65.18 percent, low enough that Facebook ultimately declared automatic deepfake detection to be an “unsolved problem.”

Yet by casting a careful eye over their code, Tarr began picking out places where it took computationally inefficient paths to its result, and his research project was under way.

“I took the top five submissions, did a whole bunch of analysis in terms of timing certain components and seeing where they were going wrong, and what was taking up the majority of their time,” he told Cybercrime Magazine.

“What I noticed was that all of them were super, super slow — mostly because the way the competition was structured was that you were rewarded for accuracy, not efficiency. So, I refocused my approach in trying to make it as accurate as possible, while being as fast as possible so that it can actually be used in real-world scenarios.”
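The kind of per-component timing Tarr describes can be sketched in a few lines of Python. This is a minimal illustration, not his code: the stage names and the three stand-in functions (`decode_frames`, `detect_faces`, `classify`) are hypothetical placeholders for whatever a real detection pipeline would run.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per pipeline stage.
stage_seconds = defaultdict(float)

@contextmanager
def timed(stage):
    """Add the time spent inside the `with` block to the named stage."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_seconds[stage] += time.perf_counter() - start

# Hypothetical stand-ins for real pipeline components.
def decode_frames(n_frames):
    return [[i] * 64 for i in range(n_frames)]

def detect_faces(frames):
    return [frame[:16] for frame in frames]

def classify(faces):
    return [sum(face) % 2 for face in faces]

def run_pipeline(n_frames=100):
    """Run every stage once, timing each one separately."""
    with timed("decode"):
        frames = decode_frames(n_frames)
    with timed("face_detection"):
        faces = detect_faces(frames)
    with timed("classification"):
        labels = classify(faces)
    return labels

def time_shares():
    """Return each stage's fraction of the total measured time."""
    total = sum(stage_seconds.values())
    if total == 0:
        return {}
    return {stage: secs / total for stage, secs in stage_seconds.items()}
```

Calling `run_pipeline()` and then `time_shares()` shows which stage accounts for most of the running time, which is where optimization effort tends to pay off first.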

In this case, “as fast as possible” meant a tenfold speed improvement: Tarr optimized and consolidated the previous winning solutions into a way of scanning videos for deepfakes that stays accurate while running quickly enough to be practical.

He’s keeping the details under his hat, admitting only that such dramatic speed improvements required adding a “novel” face detector, tapping a few machine-learning techniques such as ensembling and, he added, applying “a whole bunch of concepts I’d rather not explain.”
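Tarr has not disclosed how his system works, but ensembling in general means combining the outputs of several models into one decision. The sketch below assumes the simplest variant, averaging each model’s per-frame “fake” probabilities and then averaging across models; the function names and the 0.5 threshold are illustrative assumptions rather than anything taken from his project.

```python
from statistics import mean

def ensemble_score(per_model_frame_probs):
    """Combine several detectors into one video-level score.

    per_model_frame_probs holds one list per model, each containing that
    model's estimated probability (0 to 1) that each sampled frame is fake.
    """
    if not per_model_frame_probs or any(not p for p in per_model_frame_probs):
        raise ValueError("need at least one model with at least one frame")
    # Average over frames for each model, then average over models.
    per_model_scores = [mean(probs) for probs in per_model_frame_probs]
    return mean(per_model_scores)

def is_deepfake(per_model_frame_probs, threshold=0.5):
    """Flag the video when the ensemble score reaches the threshold."""
    return ensemble_score(per_model_frame_probs) >= threshold
```

Averaging tends to smooth out the mistakes of any single model, at the cost of running every model on every video, which is one reason speed matters so much for ensembles.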

Sifting data at the speed of life

He’s not just being coy: Tarr’s work has taken on such a life of its own that he has recently dropped out of school and begun “pursuing commercial interests.”

Now working in stealth mode, he expects the results of his fledgling “AI infrastructure company,” which he is developing to attract third-party funding, to become clearer in a few months — but he will say that he has been thinking a lot about the way deepfakes risk becoming a facilitator of identity theft.

Noting that a cybercriminal recently used voice deepfakes to scam a British CEO out of $243,000, Tarr said the ever-improving quality of deepfakes — and increasingly easy access to them by members of the general public — meant that we were approaching an equilibrium “where we cannot detect deepfakes, and that is something we are going to have to get used to.”

“At that point,” he added, “we’re going to need to mature as a society rather than getting the technology matured, because there is a limit to that. We need to be less reliant on the information that we consume from unreliable sources.”

Recognizing that this may be a bridge too far for the moment, however, Tarr sees promise in his efforts to speed up machine learning-based deepfake recognition and to scale it dramatically to support large applications.

“Regular members of the public are being affected,” he explained, “and that’s why you need to be able to scan the entire internet — and why efficiency is such a big problem.”

“AI, which is pretty much the only way to detect these deepfakes, is a really computationally expensive task,” Tarr said. “We’re talking about thousands and thousands of servers that you need to do this — so if you’re able to get even a one percent or two percent improvement in speed, it’s worth it — because that saves millions of dollars.”
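The arithmetic behind that claim can be sketched as follows. The calculation assumes that the servers needed shrink in proportion to the throughput gain, and the annual fleet cost used in the example is a purely hypothetical figure, not one Tarr gave.

```python
def annual_savings(fleet_cost_per_year, speedup_fraction):
    """Estimate yearly savings from a throughput gain.

    Assumes server count scales inversely with throughput, so a gain of
    `speedup_fraction` (0.02 for two percent) means the same workload
    needs 1 / (1 + speedup_fraction) of the original fleet.
    """
    if fleet_cost_per_year < 0 or speedup_fraction < 0:
        raise ValueError("inputs must be non-negative")
    remaining_share = 1 / (1 + speedup_fraction)
    return fleet_cost_per_year * (1 - remaining_share)

# Hypothetical fleet costing $100 million a year, sped up by two percent.
savings = annual_savings(100_000_000, 0.02)
print(f"${savings:,.0f} saved per year")
```

Under these assumptions a two percent gain on a $100 million fleet frees up a little under $2 million a year, so savings in the millions imply infrastructure spending on the order of hundreds of millions.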

– David Braue is an award-winning technology writer based in Melbourne, Australia.
