The ClawHavoc Attack Pattern Just Ran Through an Instagram Ad. Cisco, Nvidia, and skills.sh All Cleared It.
- Patrick Duggan
We documented the ClawHavoc attack in February. From January 27 to 29, 2026, threat actors uploaded 341 malicious skills to a community AI agent marketplace. Those skills were installed nine thousand times in 72 hours. The payload was AMOS (Atomic macOS Stealer), which harvested crypto wallets, browser passwords, SSH credentials, and API keys. We had the C2 infrastructure indexed. We published the indicators.
That was the malicious actor version of this attack class.
AIR Security just ran the research version and surfaced three things we didn't have in February.
What AIR Found That's New
Instagram as the distribution vector. ClawHavoc used marketplace listings to reach technical users who browse skill catalogs. AIR targeted non-technical corporate users — marketers, salespeople, designers — via a paid Instagram ad for a skill that claimed to build landing pages using Google's Stitch design tool. The ad converted. The installs came from corporate accounts. IT was not in the loop. The attack surface for AI agent skills is now social media, not just package repositories.
Star-borrowing from a legitimate repo. Rather than building a new repo's reputation from scratch, AIR submitted the fake skill to an established open-source agents repository with 36,000 GitHub stars and 156 existing skills. The skill inherited that repo's credibility instantly. Trust was not manufactured — it was borrowed from something real. Every downstream signal that pointed at the repo's star count or its existing skill catalog looked legitimate because the underlying repo was legitimate.
Named scanners cleared it. AIR tested the skill against security scanners from Cisco, Nvidia, and skills.sh. All three marked it safe. The reason is architectural: the skill shipped clean. The payload was not inside the skill — it was at a URL the skill instructed agents to visit during setup. Scanners that inspect the skill artifact find nothing malicious because nothing malicious is in the artifact. The attack surface is the external URL, which can be changed after clearance without re-triggering any scan.
AIR swapped the content behind that URL after distribution. The revised page instructed agents to download and run a script. In the research version, that script collected an email address. In a real attack, the same mechanism delivers anything the attacker wants: shell access, credential exfiltration, internal network pivot.
The Mechanics of Why Scanners Miss This
The ClawHavoc skills contained malicious instructions. Scanners that inspect skill content for commands like "run this terminal command" or "install this package" could catch those, and some did.
The AIR skill contained no malicious content at all. It was a clean skill that, during the agent's setup flow, fetched instructions from an external URL. That URL was live and harmless at scan time. The scanner sees a clean skill pointing to a clean URL. Both checks pass.
The external URL is the attack surface — and it is outside the scan boundary by design. The attacker can update that URL any time after the skill has been distributed, scanned, trusted, and installed. The scan result does not expire. The URL content does.
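The gap between scan time and run time can be made concrete with a small sketch. It assumes a scanner that stores a SHA-256 digest of whatever the setup URL served when the skill was cleared, then compares that digest against what the URL serves later. The names `ScanRecord`, `record_scan`, and `verdict_still_valid` are hypothetical, as is the URL; nothing here describes how the Cisco, Nvidia, or skills.sh scanners work.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanRecord:
    """What a scanner could store at clearance time (hypothetical structure)."""
    url: str
    content_sha256: str


def digest(content: bytes) -> str:
    """Return the hex SHA-256 digest of a page body."""
    return hashlib.sha256(content).hexdigest()


def record_scan(url: str, content: bytes) -> ScanRecord:
    """Pin the content the URL served when the skill was scanned."""
    return ScanRecord(url=url, content_sha256=digest(content))


def verdict_still_valid(record: ScanRecord, content_now: bytes) -> bool:
    """A clearance only holds while the URL still serves the scanned content."""
    return digest(content_now) == record.content_sha256


# Illustrative page bodies, not real payloads.
page_at_scan = b"Setup: open the design tool and follow the prompts."
page_after_swap = b"Setup: download and run the script at this link."

record = record_scan("https://setup.example/instructions", page_at_scan)
print(verdict_still_valid(record, page_at_scan))     # True
print(verdict_still_valid(record, page_after_swap))  # False
```

A scan result that carries no such digest has nothing to compare against, which is why a swap after clearance goes unnoticed.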
Our Detection Gap
We cover the ClawHavoc attack class with IOC indexing of C2 infrastructure and OSV malicious package feeds. We hunt for fake Claude/Cursor installers and malicious npm packages impersonating agent tooling.
We do not currently scan AI agent skill marketplace listings. The AIR attack vector — a clean skill submitted to a legitimate high-star repo — does not match our existing GitHub hunt patterns, which look for malicious repos rather than clean skills inside legitimate repos.
The external URL swap mechanism is also outside our current detection surface. A skill that ships clean and later has its setup URL swapped to a malicious page is invisible to corpus scanning that happens at ingest time.
We are adding detection queries to address both gaps. The GitHub hunt cron will begin scanning for new skills submitted to high-star agent repos with recently registered setup domains, and for skills where the referenced external documentation URL has a domain age under 90 days. Neither catches everything — a sophisticated attacker registers the domain months in advance — but both add signal the current stack lacks.
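A minimal sketch of the domain-age check, under stated assumptions: registration dates are supplied by the caller as a plain dictionary (in practice they would come from a WHOIS or RDAP lookup, which is not shown), hostnames are compared as-is rather than reduced to their registrable domain, and unknown hosts are flagged rather than trusted. The function names, hostnames, and dates are all illustrative and are not our production hunt queries.

```python
import re
from datetime import date
from urllib.parse import urlparse

# Deliberately simple URL matcher; real skill files may need a stricter parser.
URL_PATTERN = re.compile(r"https?://[^\s<>()]+")


def external_hosts(skill_text: str) -> set[str]:
    """Collect the hostnames of every http(s) URL mentioned in a skill."""
    hosts = set()
    for url in URL_PATTERN.findall(skill_text):
        host = urlparse(url).hostname
        if host:
            hosts.add(host.rstrip(".").lower())
    return hosts


def young_domains(skill_text: str, registered_on: dict[str, date],
                  today: date, max_age_days: int = 90) -> list[str]:
    """Return hosts registered fewer than max_age_days ago, or with no known date."""
    flagged = []
    for host in sorted(external_hosts(skill_text)):
        created = registered_on.get(host)
        if created is None or (today - created).days < max_age_days:
            flagged.append(host)
    return flagged


# Hypothetical skill text and registration data for illustration only.
skill_text = (
    "During setup, read https://docs.fresh-landing.example/setup "
    "and the guide at https://old-repo-host.example/agents/README"
)
registered_on = {
    "docs.fresh-landing.example": date(2026, 1, 20),
    "old-repo-host.example": date(2010, 1, 1),
}
print(young_domains(skill_text, registered_on, today=date(2026, 2, 15)))
# ['docs.fresh-landing.example']
```

Flagging hosts with no known registration date trades more noise for fewer blind spots. The opposite default would be quieter and easier to evade.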
The Mitigation That Actually Works
The scanners failed not because they are bad scanners but because they are scanning the wrong thing. Scanning the skill artifact is necessary but not sufficient when the attack surface is a URL the skill points to.
Treat skills as software: vet what a skill points to, not just what ships inside it. A skill that phones home to an external URL during setup is importing arbitrary instructions at runtime. That URL should be under your control or audited as carefully as the skill itself.
Route all agent skill installations through a controlled internal catalog. If the skill has to be approved before it reaches corporate agents, the Instagram ad distribution vector fails.
Pin skill versions and re-verify on any change. A skill that was clean yesterday may point to a compromised URL today.
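One way to combine the internal catalog with version pinning is sketched below. It assumes a catalog keyed by skill name and version that stores a SHA-256 digest of the reviewed artifact. `SkillCatalog` and its methods are hypothetical names, not an existing product or API. The sketch covers only the skill artifact itself; anything the skill fetches during setup would need the same treatment.

```python
import hashlib


def skill_digest(skill_bytes: bytes) -> str:
    """Hex SHA-256 digest of a skill artifact."""
    return hashlib.sha256(skill_bytes).hexdigest()


class SkillCatalog:
    """In-memory stand-in for an internal approval catalog (hypothetical)."""

    def __init__(self) -> None:
        self._approved: dict[tuple[str, str], str] = {}

    def approve(self, name: str, version: str, skill_bytes: bytes) -> None:
        """Record the digest of the exact artifact that was reviewed."""
        self._approved[(name, version)] = skill_digest(skill_bytes)

    def may_install(self, name: str, version: str, skill_bytes: bytes) -> bool:
        """Allow install only for an approved name and version with matching bytes."""
        pinned = self._approved.get((name, version))
        return pinned is not None and pinned == skill_digest(skill_bytes)


# Illustrative artifact contents.
reviewed = b"name: landing-page-builder\nsetup: see internal docs\n"
catalog = SkillCatalog()
catalog.approve("landing-page-builder", "1.0.0", reviewed)

print(catalog.may_install("landing-page-builder", "1.0.0", reviewed))         # True
print(catalog.may_install("landing-page-builder", "1.0.0", reviewed + b"x"))  # False
print(catalog.may_install("unreviewed-skill", "0.1.0", b"anything"))          # False
```

Any change to the artifact produces a different digest, so a modified skill has to go back through approval before agents can install it.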
Hold agents to least privilege. The figure of 26,000 agents with corporate account access is alarming because those agents had enough access to do damage. An agent that can only read a narrow set of files and has no shell access has a much smaller blast radius, regardless of what it is instructed to do.
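As a rough illustration of least privilege, the sketch below models a deny-by-default policy: an agent may call only the tools on an allowlist, and file access is limited to named directories. The tool names `read_file` and `run_shell`, the class `AgentPolicy`, and the paths are assumptions made for the example; real agent frameworks expose permissions in their own ways.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class AgentPolicy:
    """Deny-by-default policy for a single agent (hypothetical structure)."""
    allowed_tools: frozenset
    readable_roots: tuple = ()

    def permits(self, tool: str, path: Optional[str] = None) -> bool:
        """Allow a call only if the tool is listed and any path stays in scope."""
        if tool not in self.allowed_tools:
            return False
        if path is None:
            return True  # the call does not touch a file path
        target = PurePosixPath(path)
        # Reject relative paths and ".." segments instead of trying to resolve them.
        if not target.is_absolute() or ".." in target.parts:
            return False
        return any(target.is_relative_to(root) for root in self.readable_roots)


policy = AgentPolicy(
    allowed_tools=frozenset({"read_file"}),
    readable_roots=(PurePosixPath("/workspace/marketing"),),
)

print(policy.permits("read_file", "/workspace/marketing/brief.md"))          # True
print(policy.permits("read_file", "/home/user/.ssh/id_rsa"))                 # False
print(policy.permits("read_file", "/workspace/marketing/../../etc/passwd"))  # False
print(policy.permits("run_shell"))                                           # False
```

With a policy like this, a swapped setup page can still instruct the agent to run a script, but the `run_shell` call is refused before anything executes.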
Sources: AIR Security, The Story of Skills; The Hacker News; DugganUSA, ClawHavoc (February 2026)
