Privacy experts at Imperial College London have developed an AI algorithm that automatically tests privacy-preserving systems for potential data leaks.
Systems of this type are used by Google Maps and Facebook, among others, but this is the first time AI has been used to automatically find vulnerabilities in them.
The experts from Imperial’s Computational Privacy Group examined attacks on query-based systems (QBS), which are managed interfaces used by analysts to query data and derive useful global statistics. They subsequently created a brand-new AI-enabled technique called QuerySnout to discover QBS attacks.
QBS give analysts access to aggregate statistics computed from personal data such as location and demographics. They are currently used in Facebook’s audience measurement feature, which estimates the size of the audience in a specific location or demographic to help target advertising, and in Google Maps to display real-time information on how busy an area is.
In their latest study, the team from the Data Science Institute, which included Ana-Maria Cretu, Dr. Florimond Houssiau, Dr. Antoine Cully, and Dr. Yves-Alexandre de Montjoye, found that effective attacks against QBS can be discovered quickly and automatically at the press of a button.
The study was presented at the 29th ACM Conference on Computer and Communications Security.
Attacks have so far been manually developed using highly skilled expertise. This means it was taking a long time for vulnerabilities to be discovered, which leaves systems at risk. QuerySnout is already outperforming humans at discovering vulnerabilities in real-world systems.
Dr Yves-Alexandre de Montjoye, Study Senior Author and Associate Professor, Imperial College London
The Need for Query-Based Systems
Over the past ten years, our capacity to gather and store data has multiplied. Much of this data is personal, however, and its use raises serious privacy concerns, even though it is protected by laws like the EU’s General Data Protection Regulation.
Therefore, a timely and important question for data scientists and privacy experts is how to allow data to be used for good while maintaining our fundamental right to privacy.
QBS could make it possible to perform anonymous data analysis at scale while protecting privacy. In QBS, curators maintain control over the data and can therefore check and scrutinize queries sent by analysts to ensure that any returned information does not include personally identifiable information.
However, malicious attackers can get around such systems by creating queries that infer personal information about specific people by taking advantage of system flaws or implementation errors.
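As an illustration of how such an attack can work, the sketch below builds a toy QBS and runs a classic "differencing" attack against it. Everything here is hypothetical: the `ToyQBS` class, the minimum query-set size of 3, and the sample records are assumptions made for the example, not a description of any real system or of the attacks studied by the Imperial team.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]


class ToyQBS:
    """Hypothetical QBS: answers counting queries, suppressing small query sets."""

    def __init__(self, records: List[Record], min_size: int = 3) -> None:
        self._records = records      # the raw data never leaves the curator
        self._min_size = min_size    # assumed protection: refuse small counts

    def count(self, predicate: Callable[[Record], bool]) -> Optional[int]:
        n = sum(1 for r in self._records if predicate(r))
        return n if n >= self._min_size else None  # None means "suppressed"


# Made-up records; the attacker knows the target is the only 52-year-old in London.
records: List[Record] = [
    {"age": 34, "city": "London", "diagnosis": True},
    {"age": 41, "city": "London", "diagnosis": True},
    {"age": 29, "city": "London", "diagnosis": True},
    {"age": 52, "city": "London", "diagnosis": True},   # the target
    {"age": 47, "city": "London", "diagnosis": False},
    {"age": 38, "city": "Paris", "diagnosis": True},
    {"age": 52, "city": "Paris", "diagnosis": False},
]
qbs = ToyQBS(records)

# Asking about the target directly is blocked, because the query set is too small.
direct = qbs.count(
    lambda r: r["city"] == "London" and r["age"] == 52 and r["diagnosis"]
)

# Two individually harmless queries that differ only by the target.
with_target = qbs.count(lambda r: r["city"] == "London" and r["diagnosis"])
without_target = qbs.count(
    lambda r: r["city"] == "London" and r["diagnosis"] and r["age"] != 52
)

# The difference between the two answers reveals the target's diagnosis.
leaked_diagnosis = (with_target - without_target) == 1
print(direct, with_target, without_target, leaked_diagnosis)
```

Each query on its own passes the size check, yet the pair isolates one person. Real systems layer further defences on top of a size threshold, which is part of what makes attacks hard to design by hand.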
Testing the System
The development and deployment of QBS have been held back by the risk of unknown, powerful “zero-day” attacks, in which attackers exploit previously undiscovered security flaws in systems.
Data breach attacks can be simulated to find information leaks and potential vulnerabilities, testing the robustness of these systems in a manner similar to penetration testing in cyber-security.
However, manually designing and implementing these attacks against complex QBS is a difficult and lengthy process.
Accordingly, the researchers assert that limiting the possibility of powerful, unmitigated attacks is crucial for enabling QBS to be implemented safely and effectively while upholding the privacy rights of individuals.
QuerySnout, the technique created by the Imperial team, works by learning which queries to pose to the system. It then learns to automatically combine the responses to find potential privacy vulnerabilities.
Using machine learning, the model can build an attack from a series of queries and a combination of their answers that reveals a specific piece of private information. QuerySnout learns which sets of queries to ask through a fully automated process known as “evolutionary search.”
This occurs in a “black-box setting,” meaning that the AI only needs access to the system and is not required to understand how it operates to find the vulnerabilities.
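The general idea of evolutionary search in a black-box setting can be sketched in a few dozen lines. This is a heavily simplified illustration and not the QuerySnout implementation: the toy system, the query language (`any`/`eq`/`neq` conditions relative to the target), the population sizes, and the combining rule are all assumptions. In particular, a signed sum of the answers with a learned decision point stands in for the machine learning model that combines the responses.

```python
import random

OPS = ("any", "eq", "neq")   # per-attribute condition, relative to the target's value
N_ATTRS = 3
TARGET = (1, 1, 1)           # target's public attributes, assumed known to the attacker


def make_dataset(rng, n_others=120):
    """One dataset: a unique target with a random secret bit, plus random other users."""
    secret = rng.randint(0, 1)
    rows = [(TARGET, secret)]
    while len(rows) < n_others + 1:
        attrs = tuple(rng.randint(0, 1) for _ in range(N_ATTRS))
        if attrs != TARGET:                       # keep the target unique
            rows.append((attrs, rng.randint(0, 1)))
    return rows, secret


def qbs_answer(rows, query, threshold=2):
    """Toy black-box QBS: count matching users with secret bit 1; small counts return 0."""
    def matches(attrs):
        return all(op == "any" or (op == "eq") == (a == t)
                   for op, a, t in zip(query, attrs, TARGET))
    count = sum(1 for attrs, s in rows if s == 1 and matches(attrs))
    return count if count >= threshold else 0


def score(rows, solution):
    """Combine the answers to a solution's queries as a signed sum."""
    return sum(w * qbs_answer(rows, q) for q, w in solution)


def fitness(solution, train, valid):
    """Learn a decision point from `train` scores; return attack accuracy on `valid`."""
    by_class = {0: [], 1: []}
    for rows, secret in train:
        by_class[secret].append(score(rows, solution))
    if not by_class[0] or not by_class[1]:
        return 0.5
    m0 = sum(by_class[0]) / len(by_class[0])
    m1 = sum(by_class[1]) / len(by_class[1])
    correct = 0
    for rows, secret in valid:
        s = score(rows, solution)
        guess = 1 if abs(s - m1) < abs(s - m0) else 0
        correct += guess == secret
    return correct / len(valid)


def random_gene(rng):
    return tuple(rng.choice(OPS) for _ in range(N_ATTRS)), rng.choice((-1, 1))


def mutate(solution, rng):
    """Replace one randomly chosen (query, weight) pair with a fresh random one."""
    genes = list(solution)
    genes[rng.randrange(len(genes))] = random_gene(rng)
    return tuple(genes)


def evolve(train, valid, rng, pop_size=12, generations=12, n_queries=2):
    population = [tuple(random_gene(rng) for _ in range(n_queries))
                  for _ in range(pop_size)]
    scored = sorted(((fitness(s, train, valid), s) for s in population), reverse=True)
    history = [scored[0][0]]
    for _ in range(generations):
        elites = scored[: pop_size // 3]          # keep the best solutions unchanged
        children = [mutate(rng.choice(elites)[1], rng)
                    for _ in range(pop_size - len(elites))]
        scored = sorted(elites + [(fitness(c, train, valid), c) for c in children],
                        reverse=True)
        history.append(scored[0][0])
    return scored[0], history


rng = random.Random(0)
shadow = [make_dataset(rng) for _ in range(80)]   # simulated datasets with known secrets
train, valid = shadow[:40], shadow[40:]
(best_fitness, best_solution), history = evolve(train, valid, rng)
print("best found:", best_fitness, best_solution)

# A hand-written differencing attack, scored with the same fitness, for comparison.
known_attack = ((("eq", "eq", "any"), 1), (("eq", "eq", "neq"), -1))
known_fitness = fitness(known_attack, train, valid)
print("hand-written differencing attack:", known_fitness)
```

The search only ever calls `qbs_answer`, so it needs no knowledge of how the system decides what to suppress. With a search budget this small, the loop is not guaranteed to rediscover the hand-written attack; the point is only to show how candidate query sets are scored, kept, and mutated.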
We demonstrate that QuerySnout finds more powerful attacks than those currently known on real-world systems. This means our AI model is better than humans at finding these attacks.
Ana-Maria Cretu, Study Co-First Author and PhD Student, Imperial College London
QuerySnout currently only tests a few functionalities.
Dr de Montjoye added, “The main challenge moving forward will be to scale the search to a much larger number of functionalities to make sure it discovers even the most advanced attacks.”
Even so, the model enables analysts to test the resistance of QBS to various kinds of attackers. The creation of QuerySnout is a significant advancement in protecting people’s privacy when it comes to query-based systems.