The safety check that is supposed to stop an AI coding agent from running a dangerous command can be walked straight past using a shell trick that has been public for decades.
New research from Adversa AI, which named the bypass GuardFall, found it works against ten of the eleven popular open-source coding and computer-use agents the firm tested. Only one, “Continue,” was built to defend against it.
Why does it matter? These agents run shell commands with your full account access. Point one at a booby-trapped repository or software package, and a hidden instruction can quietly run a command that wipes files or steals the secrets your account can reach, from SSH keys and cloud credentials to anything sitting in your home folder.
How does it get past the guard?
Most of these agents try to stay safe by checking each command against a blocklist of dangerous patterns before running it. The flaw is that they check the command as plain text, while bash rewrites that text before it actually runs. The shell strips quotes and expands shortcuts, so the filter and the shell end up looking at two different things.
The simplest example: a filter watching for rm sees nothing wrong with r""m, because to a text matcher those are different strings. Bash removes the empty quotes and runs rm anyway.
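To make the mismatch concrete, here is a minimal Python sketch. The blocklist pattern and function names are hypothetical, not taken from any of the tested agents, and Python's shlex module stands in for bash's quote removal (it approximates POSIX word splitting and does not perform expansions).

```python
import re
import shlex

# Hypothetical blocklist entry: flag any command text containing the word "rm".
BLOCKLIST = [re.compile(r"\brm\b")]


def text_filter_allows(command: str) -> bool:
    """Return True if no blocklist pattern matches the raw command text."""
    return not any(pattern.search(command) for pattern in BLOCKLIST)


def shell_words(command: str) -> list[str]:
    """Approximate the words a POSIX shell would see after quote removal."""
    return shlex.split(command)


disguised = 'r""m -rf ./build'

# The text filter looks at the raw string and finds no "rm" in it.
print(text_filter_allows(disguised))

# After quote removal, the first word is the command the shell would run.
print(shell_words(disguised))
```

The filter and the tokenizer are given the same string, yet only the tokenizer ends up with rm as the command name. That is the gap the article describes.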
The same idea works in other forms: a command hidden in base64 and piped into a shell, or ordinary tools like find and dd turned destructive with the right flag.
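The base64 variant can be sketched the same way. The code below only builds and inspects strings; it never runs anything. The blocklist pattern is again hypothetical, and the wrapper assumes a `base64 -d` decoder piped into `sh`, which is one common way such a payload is phrased.

```python
import base64
import re

# Hypothetical blocklist entry, as a text-matching agent might define it.
RM_PATTERN = re.compile(r"\brm\b")


def blocklist_flags(command_text: str) -> bool:
    """Return True if the raw command text matches the blocklist pattern."""
    return RM_PATTERN.search(command_text) is not None


def wrap_in_base64(command: str) -> str:
    """Build (but do not run) a pipeline that decodes and executes `command`."""
    encoded = base64.b64encode(command.encode()).decode()
    return f"echo {encoded} | base64 -d | sh"


payload = "rm -rf ./build"
wrapped = wrap_in_base64(payload)

# The plain payload is caught; the wrapped form carries no literal "rm".
print(blocklist_flags(payload))
print(blocklist_flags(wrapped))

# Decoding the second word of the pipeline recovers the original command.
encoded_word = wrapped.split()[1]
print(base64.b64decode(encoded_word).decode())
```

Adding the encoded string to the blocklist would not help, since any change to the payload produces a different encoding.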
The researchers call this not a bug but “a dangerous convention and a class of problems,” which is why adding more blocklist patterns fixes none of it. There is no single CVE to track or patch.
Two things have to line up for an attack to land, and neither is exotic.
- First, the AI has to produce the malicious command. A blunt “run rm -rf” is usually refused, but the same command tucked inside normal-looking work, such as a build file or a tool’s “documentation” reply, gets emitted as a routine step.
- Second, the agent has to be running on its own, with an auto-execute flag turned on or its container sandbox switched off, both of which are routine in automated pipelines. The live tests used Claude Sonnet 4.6.
The other ten tools all left the gap open: opencode, Goose, Cline, Roo-Code, Aider, Plandex, Open Interpreter, OpenHands, SWE-agent, and the Hermes project, where the bug first surfaced and is documented in Hermes’s own issue tracker.
The tools in Adversa’s survey together carried roughly 548,000 GitHub stars as of May 2026. Adversa demonstrated the full attack end-to-end against the production Plandex binary, and the same shape worked against eight others. It describes the work as lab research; no public exploitation has been reported.
Continue, the one agent that held up, defends by reading the command the way bash will before deciding: it breaks the command into the same pieces the shell would, checks…
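As an illustration of the parse-first idea only, and not Continue's actual code, the sketch below tokenizes a command roughly the way a POSIX shell would and then checks the command name of each pipeline stage. The DANGEROUS set and all function names are assumptions made for this example. Python's shlex does not perform variable or command substitution, so a real defense would need considerably more than this.

```python
import os
import shlex

# Hypothetical deny set; a real policy would be broader and context-aware.
DANGEROUS = {"rm", "dd", "mkfs"}
PUNCTUATION = set("();<>|&")


def split_like_shell(command: str) -> list[str]:
    """Tokenize with quote removal, keeping operators such as && as tokens."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


def command_names(tokens: list[str]) -> list[str]:
    """Collect the word that follows each operator (and the very first word).

    This over-approximates on purpose: the word after a redirect is also
    treated as a command name, which can only cause extra blocking.
    """
    names = []
    expect_command = True
    for token in tokens:
        if token and all(ch in PUNCTUATION for ch in token):
            expect_command = True
        elif expect_command:
            names.append(os.path.basename(token))
            expect_command = False
    return names


def is_blocked(command: str) -> bool:
    """Return True if any command name in the line is on the deny set."""
    try:
        tokens = split_like_shell(command)
    except ValueError:
        return True  # unbalanced quotes: fail closed rather than guess
    return any(name in DANGEROUS for name in command_names(tokens))


print(is_blocked('r""m -rf ./build'))
print(is_blocked("ls && /bin/rm -rf x"))
print(is_blocked("echo rm"))
```

Because the check runs on tokens after quote removal, r""m and /bin/rm are both recognized as rm, while echo rm passes because rm is only an argument there.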

