AI and Protecting against the Next Generation of Privacy Threats

  Shuman Ghosemajumder


  Jason Feldt
Published January 28, 2021

Protecting the privacy of your own data in 2021 is hard and seemingly only getting harder. For most people, learning to use new technology isn’t easy, and adapting to continual changes in products and services makes it more complex. With rapidly evolving security and privacy risks, there are always more privacy threats than individual consumers can track.

You are likely very familiar with the most common threats that put your data at risk, such as using insecure passwords, sharing passwords, and not installing required security updates on your devices. Poor privacy practices and inadequate security in online applications pose a further risk, regardless of the actions you take to protect yourself.

But there are many other privacy risks that go beyond these typical cases. Here are some examples of more subtle privacy threats that you may not be thinking about regularly:

Geotagging. When you share a photo from your smartphone, what are you actually sharing? You might not realize it, but smartphones record detailed metadata about your device and image in each JPEG file—including precise GPS coordinates for where the picture was taken. When you share that JPEG with others, or upload it, you are sharing this metadata too (this is how social media platforms can automatically categorize photos by time and location). If you share photos on a regular basis, you can effectively be sharing a detailed trail of your movements. Law enforcement regularly uses image metadata to locate unwitting criminals. While most major social platforms will strip out the metadata from uploaded photos, occasionally a defect or gap in their site may result in this metadata being accessible.
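
To make the geotagging risk concrete, here is a minimal Python sketch of how little work it takes to turn embedded coordinates into a point on a map. EXIF metadata stores GPS position as degrees, minutes, and seconds plus a hemisphere letter; the helper name `dms_to_decimal` and the sample values are hypothetical, and the step of actually reading the tags out of a JPEG file is left out.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) plus a hemisphere
    reference ('N', 'S', 'E', 'W') into signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(x) for x in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # South and west are negative in decimal notation.
    if ref in ("S", "W"):
        decimal = -decimal
    return float(decimal)


# Hypothetical values, shaped like the GPS tags a phone camera
# might write into a JPEG. They are not taken from a real photo.
sample_gps = {
    "GPSLatitude": (47, 36, Fraction(2232, 100)),
    "GPSLatitudeRef": "N",
    "GPSLongitude": (122, 19, Fraction(5556, 100)),
    "GPSLongitudeRef": "W",
}

latitude = dms_to_decimal(sample_gps["GPSLatitude"], sample_gps["GPSLatitudeRef"])
longitude = dms_to_decimal(sample_gps["GPSLongitude"], sample_gps["GPSLongitudeRef"])
print(f"{latitude:.6f}, {longitude:.6f}")  # prints "47.606200, -122.332100"
```

Pasting a pair of numbers like this into any mapping service is enough to show where a photo was taken, which is why a series of shared photos can add up to a trail of movements.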

High-resolution risks. Digital cameras today offer incredible image quality, such that even amateur photos can look fantastic. However, there is a downside to those tack-sharp images with many megapixels of detail: they can reveal more information than intended. For example, it is possible to take advantage of this high resolution to identify nearby people who are not in the photo from their reflections in the eyes of the subjects. You can also pick up other fine details, like text on confidential papers someone is carrying or someone’s fingerprints, even from a distance. The same risk applies to advances in media fidelity in other fields. High-quality audio recordings from sensitive microphones, such as those in intelligent personal assistants, can pick up private conversations not meant to be recorded. You can even analyze audio recordings of keyboard sounds to reconstruct what was typed. As the quality of video and audio sensors and devices increases, these risks will multiply.

Van Eck Phreaking. This technique uses specialized equipment to analyze electromagnetic signals in order to eavesdrop on communications. Anyone who handles highly sensitive information will be familiar with working in rooms with no windows, or inside a Faraday cage where no devices with microphones are allowed, because of techniques that use radio emissions or microphones to “watch” your laptop screen. There are analogous techniques that can use a laser to listen to a conversation from 500 meters away, or exfiltrate data using ultrasonic signals. This type of technology can also be used for beneficial purposes (it’s how the Zoom app detects physical Zoom appliances in the same room), but of course it can be, and is, used by malicious actors to spy on others.

Scraping. When you upload data to an online application, you might expect that it will stay there unless someone who knows you intentionally copies it somewhere else. But in fact there are thousands of groups constantly scraping online data, for a variety of reasons, both good and bad. Google constantly scrapes the web to create its search index, but other scrapers are malicious, and use bots to create attacks so vast in scale that they would not be possible using manual means. At F5, we regularly see that more than 90% of all login attempts on many major web applications, on a 24/7 basis, come from bots looking to take over accounts. Bots are also used to scrape images from social media platforms, which are then used to create additional fake accounts that look plausible to other users. This is one of the reasons you see so many social media bots with the same profile photos. If you are wondering whether your social media image is being used by fake accounts, you can do a reverse image search and investigate.

Cross-correlation risks. Individual bits of benign data can be correlated to expose more about you than you might realize is possible from the pieces themselves. Giving your phone number to a drug store to save some money may seem innocuous, but when that number is looked up on third-party marketing lists, and then combined with leaked lists of voter registrations, it can now be used to identify where you live, how you vote, your health issues, your movements, and whom you communicate with on social media. This profile of you can then also be re-sold over and over. In 2006, AOL released what it thought was a data set of “anonymized” search requests, but by combining bits of personally identifiable information (PII) scattered through the queries, others were able to identify some of the users in the data set.
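
The drug store example can be sketched in a few lines of Python. Everything here is invented for illustration: the three small data sets, their field names, and the `build_profiles` helper are assumptions, not a description of how any real data broker works. The point is only that a shared key, such as a phone number, is all it takes to stitch separate records into one profile.

```python
# All records below are invented for illustration.
loyalty_program = [
    {"phone": "555-0100", "purchase": "allergy medication"},
    {"phone": "555-0199", "purchase": "vitamins"},
]
marketing_list = [
    {"phone": "555-0100", "name": "A. Example", "address": "1 Sample St"},
]
voter_roll = [
    {"name": "A. Example", "address": "1 Sample St", "party": "Independent"},
]


def build_profiles(loyalty, marketing, voters):
    """Join three data sets: the phone number links the first two,
    and (name, address) links the second to the third."""
    by_phone = {row["phone"]: row for row in marketing}
    by_identity = {(row["name"], row["address"]): row for row in voters}

    profiles = []
    for purchase in loyalty:
        contact = by_phone.get(purchase["phone"])
        if contact is None:
            continue  # no match, so this number stays unlinked
        voter = by_identity.get((contact["name"], contact["address"]), {})
        profiles.append({**purchase, **contact, **voter})
    return profiles


profiles = build_profiles(loyalty_program, marketing_list, voter_roll)
for profile in profiles:
    print(profile)
```

In this sketch, the first loyalty record ends up carrying a name, a home address, a party registration, and a health-related purchase, none of which the shopper handed over together. The second record stays unlinked only because its number does not appear on the marketing list.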

These are just a few examples in an ever-growing list of threats to your privacy. It’s probably a lot for most people to process. And yet, AI and automation technologies, in the hands of cybercriminals and other malicious actors, make these privacy threats exponentially worse.

As illustrated by scraping, massive amounts of data can be stolen and repurposed at high speed using sophisticated bots. Similarly, bots can scrape geotagging information from billions of online photos, and AI can rapidly analyze these datasets to detect “interesting” patterns, such as fingerprints or recognized faces. Van Eck Phreaking and similar techniques are traditionally thought to require physical proximity, but now the ubiquity of vulnerable IoT devices makes large-scale invasion of privacy possible at a distance (think of The Dark Knight, when Batman hijacks every mobile phone in Gotham to emit high-frequency bursts to perform radar-like mass surveillance to find the Joker). Our security research teams have found that the entire IPv4 space can be automatically scanned for vulnerable devices in a matter of hours. They also found Internet-connected baby cameras that were compromised through automation and then used to speak to children in their homes. Finally, deep learning systems take cross-correlation privacy risks to a new level, identifying patterns that humans would never find on their own. In short, the use of automation and AI, combined with security issues, creates entirely new categories of privacy threats, which can then be exploited at Internet scale.
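
A rough calculation shows why scanning the whole IPv4 space in a matter of hours is plausible. The address count is fixed at 2^32; the probe rates below are hypothetical round numbers chosen for illustration, not measurements from our research teams or from any particular scanning tool, and the `scan_hours` helper is likewise just a sketch.

```python
IPV4_ADDRESSES = 2 ** 32  # every possible 32-bit address, before exclusions


def scan_hours(probes_per_second, addresses=IPV4_ADDRESSES):
    """Hours needed to send one probe to each address at a steady rate."""
    return addresses / probes_per_second / 3600


# Hypothetical probe rates, not measurements from any real scanner.
for rate in (10_000, 100_000, 1_000_000):
    print(f"{rate:>9,} probes/s -> {scan_hours(rate):6.1f} hours")
```

Under these assumptions, a sender managing 100,000 probes per second would cover every address in about 12 hours, and one managing a million per second would need a little over an hour.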

Dealing with all of these threats in the long term is clearly beyond the abilities and energies of most individual consumers. So, what’s the solution? For a comprehensive, societal-level answer, the onus is on governmental policy and regulation, along with platforms, products, and organizations, to keep us as safe as possible using the best available security and privacy technology, which incorporates its own advanced AI and automation capabilities on our behalf. Over time, we are seeing that consumers and governments are holding companies more accountable for doing this effectively, and we’ve already seen actions punishing companies that fail to protect against these threats. Meanwhile, companies like Apple that approach product design from a privacy-first standpoint are justly praised and rewarded.

While addressing privacy threats in a systemic way is our best long-term defense, we should also take a number of simple steps ourselves that can go a long way toward protecting our own privacy, especially against today’s most common risks. This is why today, Data Privacy Day, is so important and helpful. It’s a great opportunity for everyone around the world to remind themselves of basic privacy practices that can produce great benefit, without having to be a privacy expert.