CAMIA privacy attack reveals what AI models memorise

Researchers have developed a new attack that reveals privacy vulnerabilities by determining whether your data was used to train AI models.

The method, named CAMIA (Context-Aware Membership Inference Attack), was developed by researchers from Brave and the National University of Singapore and is far more effective than previous attempts at probing the ‘memory’ of AI models.

There is growing concern of “data memorisation” in AI, where models inadvertently store and can potentially leak sensitive information from their training sets. In healthcare, a model trained on clinical notes could accidentally reveal sensitive patient information. For businesses, if internal emails were used in training, an attacker might be able to trick an LLM into reproducing private company communications.

Such privacy concerns have been amplified by recent announcements, such as LinkedIn’s plan to use user data to improve its generative AI models, raising questions about whether private content might surface in generated text.

To test for this leakage, security experts use Membership Inference Attacks, or MIAs. In simple terms, an MIA asks the model a critical question: “Did you see this example during training?”. If an attacker can reliably figure out the answer, it proves the model is leaking information about its training data, posing a direct privacy risk.

The core idea is that models often behave differently when processing data they were trained on compared to new, unseen data. MIAs are designed to systematically exploit these behavioural gaps.

Until now, most MIAs have been largely ineffective against modern generative AIs. This is because they were originally designed for simpler classification models that give a single output per input. LLMs, however, generate text token-by-token, with each new word being influenced by the words that came before it. This sequential process means that simply looking at the overall confidence for a block of text misses the moment-to-moment dynamics where leakage actually occurs.

The key insight behind the new CAMIA privacy attack is that an AI model’s memorisation is context-dependent. An AI model relies on memorisation most heavily when it’s uncertain about what to say next.

For example, given the prefix “Harry Potter is…written by… The world of Harry…”, in the example below from Brave, a model can easily guess the next token is “Potter” through generalisation, because the context provides strong clues.

In such a case, a confident prediction doesn’t indicate memorisation. However, if the prefix is simply “Harry,” predicting “Potter” becomes far more difficult without having memorised specific training sequences. A low-loss, high-confidence prediction in this ambiguous scenario is a much stronger indicator of memorisation.

CAMIA is the first privacy attack specifically tailored to exploit this generative nature of modern AI models. It tracks how the model’s uncertainty evolves during text…

Source link

Disclaimer

We strive to uphold the highest ethical standards in all of our reporting and coverage. We blogs.grocliq.com want to be transparent with our readers about any potential conflicts of interest that may arise in our work. It’s possible that some of the investors we feature may have connections to other businesses, including competitors or companies we write about. However, we want to assure our readers that this will not have any impact on the integrity or impartiality of our reporting. We are committed to delivering accurate, unbiased news and information to our audience, and we will continue to uphold our ethics and principles in all of our work. Thank you for your trust and support.

Website Upgradation is going on for any glitch kindly connect at [email protected]

Categorized in: