Gmail’s spam filters can now understand “adversarial text manipulations.”
The latest post on the Google Security blog details a new upgrade to Gmail’s spam filters that Google is calling “one of the largest defense upgrades in recent years.” The upgrade comes in the form of a new text classification system called RETVec (Resilient & Efficient Text Vectorizer). Google says this can help understand “adversarial text manipulations”—these are emails full of special characters, emojis, typos, and other junk characters that previously were legible by humans but not easily understandable by machines. Previously, spam emails full of special characters made it through Gmail’s defenses easily.
If you want an example of what “adversarial text manipulation” looks like, the below message is something from my spam folder. My personal Gmail experience with these emails is that they used to be a major problem during the first half of the year, with emails like this regularly landing in my inbox. It does seem like this RETVec tech upgrade works, though, because emails like this haven’t been a problem at all for me in the last few months.
Emails like this have been so difficult to classify because, while any spam filter could probably swat down an email that says, “Congratulations! A balance of $1,000 is available for your jackpot account,” that’s not what this email actually says. A big portion of the letters here are “homoglyphs“—by diving into the endless depths of the Unicode standard, you can find obscure characters that look like they’re part of the normal Latin alphabet but actually aren’t.
For instance, the subject “𝐂𝐡𝐞𝐜𝐤_𝐘𝐨𝐮𝐫_𝐀𝐜𝐜𝐨𝐮𝐧𝐭” is weirdly bolded not because it has bolded styling but because it uses Unicode glyphs like the “Mathematical Bold Capital C.” It’s a math symbol that happens to look like the letter “C” to people, but the robot doing spam filtering accurately views it as a math symbol and doesn’t understand the intended English meaning. The closer you look at an email like this, the worse it gets: “C0NGRATULATIONS” has a zero replacing one of the “O” characters, the underlined letters in “Jᴀ̲ᴄ̲ᴋ̲pot” are so strange they don’t even come up in Unicode searches, and a lot of spaces are swapped out for periods or underscores. The result is that a spam filter looks at this hot mess of an email and basically gives up. (I don’t understand why illegible emails default to “inbox” instead of “spam,” but I’m not in charge.)Advertisement
Google says RETVec is here to save the day: “RETVec is trained to be resilient against character-level manipulations including insertion, deletion, typos, homoglyphs, LEET substitution, and more. The RETVec model is trained on top of a novel character encoder which can encode all UTF-8 characters and words efficiently. Thus, RETVec works out-of-the-box on over 100 languages without the need for a lookup table or fixed vocabulary size.”
Google says the efficiency here is a big deal. Alternative approaches that used a “fixed vocabulary size” or “lookup table” for homoglyphs made them resource-intensive to run. Imagine a list of every possible spelling and misspelling of “congratulations” that swaps out one or more characters for numbers, math symbols, Cyrillic, Hebrew, or emojis, and you have a nearly endless list. Google says RETVec is only 200,000 “instead of millions of parameters,” so while Google’s spam-filtering cloud is probably big enough to run anything, this is small enough that it could even run on a local device. RETVec is open source, and Google hopes it will rid the world of homoglyph attacks, so even your local comment section could be running it someday.
RETVec appears to work a lot like how humans read: It’s a machine-learning TensorFlow model that uses visual “similarity” to identify what words mean instead of their actual character content. Google’s similarity demo uses the same technology to identify pictures of cats, so turning that into the world’s fanciest optical character recognition system sounds pretty doable. Apparently, this approach has led to big improvements, with Google saying: “Replacing the Gmail spam classifier’s previous text vectorizer with RETVec allowed us to improve the spam detection rate over the baseline by 38% and reduce the false positive rate by 19.4%. Additionally, using RETVec reduced the TPU usage of the model by 83%, making the RETVec deployment one of the largest defense upgrades in recent years.”
Google says it has been testing RETVec internally “for the past year,” and it has already rolled out to your Gmail account.