
  • Writer: Patrick Duggan
  • 6 min read

# AI Is Now a Door, a Lure, and a Safe. The Twist Is That Attackers Are Opening All Three With Tricks Older Than the Technology.


In a single week we documented three separate attacks against artificial intelligence, and at first glance they have nothing to do with each other. One defeats the safety guardrails of AI coding agents. One weaponizes the fake domains that AI models hallucinate. One is an infostealer built to grab the API keys for Claude, Gemini, and Codex. Different targets, different attackers, different mechanics. But step back and they snap into one shape: AI has quietly become a full attack surface. It is a door attackers walk through, a lure attackers bait, and a safe attackers crack. Three years ago none of this was on the threat model. Today it belongs on every part of it. And here is the part that should reframe how you think about defending it: none of the three attacks is actually new. We built the newest surface in computing, and attackers showed up to rob it with the oldest tools in the box.




## The door: AI agents execute, and their guards are asleep

Start with the door, because it is the most direct. AI coding agents do not just suggest code — they run it. Point one at a repository and it reads, reasons, and then executes shell commands on your machine to build and fix things, usually with a safety check and an approval prompt standing between it and disaster. That check is the entire security boundary.


This week, researchers at Adversa AI showed that boundary is largely decorative. Their GuardFall work tested eleven popular open-source agents and found that ten of them could be tricked into running malicious commands by hiding the intent inside ordinary shell obfuscation. The AI agent, in other words, is the new login shell — a process with terminal access, reasoning over untrusted input, guarded by a filter that reads the disguise instead of the command. When an agent with shell access reasons over a hostile repository, the hostile repository has a route to your machine.


## The lure: AI hallucinates, attackers register

Now the lure. Language models do not retrieve facts; they predict plausible text, which means they routinely invent things that do not exist (a package name, a support URL, a company's web address) and present them with total confidence. Palo Alto Networks' Unit 42 calls the exploitation of this Phantom Squatting: probe the models, harvest the plausible-but-fake domains they hallucinate for real brands, and register them before anyone notices. Unit 42 generated roughly a quarter of a million unique phantom domains across 913 brands. The package-registry twin, slopsquatting, in which attackers register the package names models hallucinate, has already drawn blood: a campaign called PhantomRaven turned hallucinated npm package names into malware installed 86,000 times.


The elegance is that the model does the attacker's marketing. It invents the fake, tells the user to trust it, and sends the traffic. The lure baits itself.


## The safe: AI credentials are loot now

And the safe. Earlier this week we wrote up Djinn Stealer, a cross-platform infostealer delivered through a compromised remote-support tool. Buried in its target list, alongside the usual browser passwords and crypto wallets, was a new category: the credentials for AI development assistants — Anthropic Claude, Google Gemini, OpenAI Codex, Cline — right next to the npm, PyPI, and Cargo publish tokens. Your model-provider API key is now loot, harvested and sold like any other secret. The safe is new; cracking safes is not.


## The twist: the newest surface, the oldest tricks

Here is where these three stop being a list and become a lesson. Look at how each attack works, and the novelty evaporates.


GuardFall does not use some exotic AI-specific exploit. It uses Bash obfuscation — shell tricks that are older than most of the developers running the agents. Bash dates to 1989. The technique that walks past a 2026 AI guardrail is a technique that would have worked on a shell script in the Clinton administration.


Phantom Squatting is typosquatting with a new muse. Registering deceptive domains and package names to catch people who go to the wrong place is one of the oldest tricks on the internet. The only thing that changed is who is doing the misdirecting — it used to be the user's typo, now it is the model's hallucination. Same crime, new accomplice.


And Djinn is credential theft, which is as old as credentials. Steal the secret, sell the secret. The target list added a few new rows for AI keys, but the malware is doing in 2026 exactly what infostealers did in 2006.


This is the actual trend, and it is more useful than "AI is under attack." The newest, most rapidly adopted surface in computing is being attacked almost entirely with the oldest, most reliable techniques we know, because every new surface reintroduces the old holes. AI agents brought back command-filter bypass: a blocklist that reads the text of a command instead of what the shell will execute. AI hallucination brought back squatting. AI adoption brought back credential theft as a growth market. The novelty is never the attack. The novelty is the door the new technology left standing open, and attackers are far too experienced to invent something new when something ancient still works.


## Why this is good news, if you let it be

The instinct when a new threat category appears is to demand new defenses: AI-specific tooling, novel controls, a fresh product category. Resist it, because the observation above is the whole point: if the attacks are old, the defenses are known. You do not need an exotic new discipline to defend the AI surface. You need to apply the oldest disciplines to it, the ones we already teach and mostly ignore.


Against the door: parse the command the way the shell will, do not pattern-match the text; the one agent that passed GuardFall did exactly this. Anything the check cannot resolve statically, such as command substitution or a pipe into another interpreter, should fail closed rather than pass as unrecognized. Run untrusted code exploration in a disposable, isolated environment, not on the machine that holds your keys. Least privilege for anything with a terminal. These are 1990s lessons pointed at a 2026 process.
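To make the difference concrete, here is an added example, not code from Adversa AI's GuardFall or from any of the agents it tested, contrasting the two kinds of check. `denylist_check` is the style of guardrail that pattern-matches the raw text. `structural_check` tokenizes the command with Python's `shlex`, which applies POSIX quote removal and backslash escapes the way the shell would, then allowlists the resolved command word of every simple command in the pipeline. The allowlist, the sample commands and the base64 payload are constructed for the demo. `shlex` is a POSIX tokenizer, not bash, so anything bash resolves only at run time (`$(...)`, backticks, `$'...'` ANSI-C quoting, parameter expansion) is refused outright rather than guessed at.

```python
import re
import shlex

# Constructed sample commands for the demo: two benign build steps, the
# undisguised destructive command, and disguised forms of the same command.
# "cm0gLXJmIC8=" is base64 for "rm -rf /"; nothing here is ever executed.
SAMPLES = {
    "benign":     "npm run build",
    "env_prefix": "CI=1 npm test",
    "plain":      "rm -rf /",
    "quotes":     'r""m -rf /',
    "backslash":  "r\\m -rf /",
    "ansi_c":     "$'\\x72\\x6d' -rf /",
    "indirect":   'eval "$(echo cm0gLXJmIC8= | base64 -d)"',
    "pipe_to_sh": "echo cm0gLXJmIC8= | base64 -d | sh",
}

# --- The guardrail style GuardFall walks past: a regex over the raw text. ---
DENY_RE = re.compile(r"\brm\s+-rf\b")

def denylist_check(cmd):
    """Pattern-match the text. Returns True if the command would be allowed."""
    return DENY_RE.search(cmd) is None

# --- The alternative: resolve what the shell will run, then allowlist it. ---

# Runtime expansion the checker cannot resolve statically: $(...), backticks,
# ANSI-C quoting $'...', parameter expansion. Bash resolves these when it runs
# the command; shlex does not, so the only safe answer is to refuse.
DYNAMIC_RE = re.compile(r"\$\(|`|\$'|\$\{|\$[A-Za-z_]")
ASSIGNMENT_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")   # leading VAR=value
SEPARATORS = {"|", "||", "&&", ";", "&", "(", ")"}
# Hypothetical allowlist for an agent working on a JS/Rust/Go repository.
ALLOWED_COMMANDS = {"npm", "pnpm", "yarn", "node", "cargo", "go", "make",
                    "ls", "cat", "grep", "echo"}

def resolve_commands(cmd):
    """Tokenize like a POSIX shell and return the argv of every simple command.
    shlex applies quote removal and backslash escapes, so r""m and r\\m both
    come out as rm, which is what bash would execute."""
    lexer = shlex.shlex(cmd, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    commands, argv, skip_next = [], [], False
    for tok in lexer:
        if skip_next:                      # redirection target is a filename
            skip_next = False
            continue
        if tok and all(c in "();<>|&" for c in tok):
            if tok[0] in "<>":             # redirection operator (>, >>, 2>&1, <)
                skip_next = True
            elif tok in SEPARATORS and argv:
                commands.append(argv)
                argv = []
            continue
        argv.append(tok)
    if argv:
        commands.append(argv)
    return commands

def structural_check(cmd):
    """Return (allowed, reason). Fails closed on anything it cannot resolve."""
    if DYNAMIC_RE.search(cmd):
        return False, "runtime expansion cannot be resolved statically"
    try:
        commands = resolve_commands(cmd)
    except ValueError as exc:              # unbalanced quotes and the like
        return False, f"unparseable: {exc}"
    for argv in commands:
        i = 0
        while i < len(argv) and ASSIGNMENT_RE.match(argv[i]):
            i += 1                         # skip VAR=value prefixes
        if i == len(argv):
            continue                       # bare assignment runs nothing
        if argv[i] not in ALLOWED_COMMANDS:
            return False, f"command not on allowlist: {argv[i]}"
    return True, "ok"

def compare(samples=SAMPLES):
    """Both verdicts per sample: {name: (denylist_allows, structural_allows)}."""
    return {name: (denylist_check(cmd), structural_check(cmd)[0])
            for name, cmd in samples.items()}

if __name__ == "__main__":
    for name, (deny, struct) in compare().items():
        print(f"{name:11} denylist={'allow' if deny else 'block':5} "
              f"structural={'allow' if struct else 'block'}")
```

Run it and the text matcher lets every disguised form through while the structural check blocks all of them, including the base64-to-`sh` pipe, which it rejects because `base64` and `sh` are not on the allowlist rather than because it recognised the payload. The cost of failing closed is that legitimate commands relying on variable expansion must be pre-expanded by the agent before the check; that is the trade-off the one passing agent accepted.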


Against the lure: verify dependencies before you install them, pin what you use, and never let "the AI suggested it" substitute for "we checked it." Enumerate the domains AI hallucinates about your own brand and put the dangerous ones on a watch list or register them yourself — defensive registration is a discipline careful brands have practiced against typosquatting for twenty years.


Against the safe: treat AI API keys like every other secret you already know how to protect — rotate them, scope them, keep them out of source and out of reach of a process that just read an untrusted file. Assume-breach posture on your model credentials, same as your cloud ones.
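As an added example of the last point, and not code from the Djinn write-up or from any of the named providers, the following Python spawns a child process (a build step, an agent's shell command) with a scrubbed environment: only an explicit baseline is inherited, and anything whose name looks like a credential is dropped even if someone later adds it to that baseline. The baseline set, the credential regex and the sample parent environment are constructed for the demo; the variable names in the sample are chosen to stand for the model-provider and registry tokens the text describes, with placeholder values.

```python
import os
import re
import subprocess

# Hypothetical baseline of variables a build tool legitimately needs; tune per repo.
KEEP_EXACT = {"PATH", "HOME", "LANG", "LC_ALL", "TERM", "TMPDIR", "USER", "SHELL"}
KEEP_PREFIX = ("LC_",)
# Names that look like credentials are dropped even if they appear in KEEP_*.
SECRET_RE = re.compile(r"KEY|TOKEN|SECRET|PASSWORD|PASSWD|CREDENTIAL", re.IGNORECASE)

# Constructed parent environment for the demo: the kind of loot an infostealer's
# target list names, sitting in the same shell an agent process would inherit.
SAMPLE_PARENT_ENV = {
    "PATH": "/usr/local/bin:/usr/bin:/bin",
    "HOME": "/home/dev",
    "LANG": "en_US.UTF-8",
    "ANTHROPIC_API_KEY": "constructed-sample-value",
    "GEMINI_API_KEY": "constructed-sample-value",
    "OPENAI_API_KEY": "constructed-sample-value",
    "NPM_TOKEN": "constructed-sample-value",
    "CARGO_REGISTRY_TOKEN": "constructed-sample-value",
    "AWS_PROFILE": "prod",   # not secret by name, but not on the baseline either
}

def scrub_env(env):
    """Minimal environment for a child that will read untrusted input: keep only
    the explicit baseline, and never keep a name that looks like a credential."""
    return {
        name: value
        for name, value in env.items()
        if (name in KEEP_EXACT or name.startswith(KEEP_PREFIX))
        and not SECRET_RE.search(name)
    }

def leaked_secrets(env):
    """Names in env that look like credentials; a pre-flight assertion for the
    environment you are about to hand to an agent."""
    return sorted(name for name in env if SECRET_RE.search(name))

def run_isolated(argv, parent_env=None, **kwargs):
    """subprocess.run with the parent's credentials withheld from the child."""
    parent_env = os.environ if parent_env is None else parent_env
    return subprocess.run(argv, env=scrub_env(parent_env), **kwargs)

if __name__ == "__main__":
    print("would leak:", leaked_secrets(SAMPLE_PARENT_ENV))
    child = run_isolated(["sh", "-c", "echo ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-<unset>}"],
                         parent_env=SAMPLE_PARENT_ENV, capture_output=True, text=True)
    print("child sees:", child.stdout.strip())
```

The allowlist does the real work (`AWS_PROFILE` is dropped without matching any secret pattern); the regex is a second layer so that a future edit to the baseline cannot quietly re-expose a token. Pair it with a disposable working directory and short-lived, scoped provider keys, and a hostile repository that gets code execution inside the child finds nothing worth stealing.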


None of that is new. That is the reassurance and the indictment in one sentence: the AI surface is defensible with what we already know, and the reason it is getting attacked so successfully is that we are not applying what we already know to it yet.


## Why we are naming the shape

We are unusually positioned to watch all three faces of this at once, and we will say so plainly because it is the reason we can draw the shape: we measure how AI models hallucinate about brands, we build pre-flight checks for the agents that execute, and we run a threat feed that catches the malicious domains and packages when they go live. Most of the field sees one face of the AI attack surface. Sitting on all three is what let us notice they are the same surface.


We will cap it at ninety-five percent, as always — this is a pattern read from a week's worth of incidents, not a proven law, and the specifics will keep shifting as the technology and the attackers both move fast. But the shape is real and it is worth naming now: AI is a door, a lure, and a safe; attackers are opening all three with tools that predate the technology by decades; and the defense is not a new invention but an old discipline finally pointed at a new surface. The future of attacking AI, it turns out, is mostly the past — which means the future of defending it is a set of lessons we already learned and get one more chance to actually use.




