A new hack can trick AI browsers into breaking their guardrails by constructing a false reality around them where the rules are made up and actions don’t have consequences. Put another way, they’re basically hypnotized into doing stuff that could have devastating consequences for the user.

These were the findings of new research from the cybersecurity firm LayerX, and they further illustrate the dangers posed by weaving autonomous AI agents into the software we use to navigate the internet.

Through the hack, the researchers demonstrated that leading AI browsers like OpenAI’s ChatGPT Atlas, Perplexity AI’s Comet, and Anthropic’s Claude plugin for Google Chrome could be duped into executing any command, allowing a hacker to change a user’s password, install malware, and steal their information.

They call this hack “BioShocking,” a reference to the video game BioShock, in which the protagonist is hypnotized with a specific phrase into acting against their will.

Normally, the “AI operates under the assumption that its context is real, and its behavior must therefore fall within the bounds of its safety guardrails,” the researchers wrote. But if the AI is tricked into thinking its context is a “fantasy,” then there’s nothing holding the AI back.

This works by having the AI engage in a sort of game. The researchers created a proof-of-concept page with BioShock-themed puzzles in which the AI is rewarded for giving intentionally incorrect answers, like 2+2 = 5 (another allusion to the acclaimed 2007 title).

This essentially taught the AI browsers that “incorrect” actions are acceptable, untethering them from reality to the extent that they espouse paradoxical statements. “Victory is defeat,” a brainwashed AI browser intones, in a reference to George Orwell’s novel “1984.”

What this looks like in practice: an unwitting user could open a seemingly innocuous web page laced with malicious prompts, a tactic known as prompt injection, that trap the AI browser in the malicious game. In one scenario shared by the researchers, the AI is tricked into navigating to “/code,” which opens the code repository of the user’s employer on GitHub.

“In a real attack scenario, that redirect could point anywhere in the user’s browser session — open tabs, authenticated repositories, internal tools,” the researchers noted.

The hack happens out in the open, so a user can easily intervene once they see their AI behaving maliciously in the window, if they’re paying attention, that is. On the other hand, the vulnerability exposed is undeniable: the context that AI browsers act in can be manipulated by brainwashing them into thinking they’re playing a game. In this age, hackers no longer have to rely solely on tricking the user; now…



Last Update: July 3, 2026