Hackers Asked Meta's AI to Hand Over the White House's Instagram. It Did. The Soft Surface Is the Chatbot Now — We Called the Shape on May 9.
- Patrick Duggan
There is a version of an account takeover that involves a zero-day, a memory-corruption chain, and a researcher who did not sleep for three nights. This is not that version. In the first days of June, a crew took over the Instagram accounts of the Obama-era White House, the U.S. Space Force's senior enlisted leader, a well-known security researcher, and Sephora. The entire technique was to open a chat with Meta's AI support assistant and ask it, politely, to do the takeover for them. The hard perimeter at one of the most heavily defended companies on earth never had to be touched. They walked up to the help desk, and the help desk was an AI that was eager to help.
Here is how it actually worked, because the mechanism is the whole lesson. The attacker first switched on a VPN with an exit near the victim's usual hometown — Instagram uses geographic location as an authorization signal, so spoofing the region kept the automated defenses from ever flagging the session. Then they requested a password reset and chose to chat with the Meta AI Support Assistant. They told the bot to attach a new email address — one the attacker controlled — to the victim's account. The bot did it, and then sent the one-time verification code to that new attacker-supplied address instead of the address already on file. Relay the code back, reset the password, and the account is gone. The only checkpoint that would have stopped this — sending the code to the email the real owner already registered — was the exact checkpoint the AI removed when it decided the person in the chat was the rightful owner.
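To make the missing checkpoint concrete, here is a minimal sketch in Python of the rule the assistant skipped: a recovery code may only go to a contact that was verified before the session started, and a newly supplied address is staged until an existing contact confirms it. Nothing here is Meta's code. The `Account` type and the functions `send_recovery_code` and `request_email_change` are hypothetical names invented for illustration.

```python
import secrets
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class Account:
    """Hypothetical account record, for illustration only."""
    handle: str
    verified_emails: Set[str] = field(default_factory=set)
    pending_emails: Set[str] = field(default_factory=set)


def send_recovery_code(account: Account, destination: str) -> Tuple[str, str]:
    """Issue a six-digit code, but only to an address the owner already verified."""
    if destination not in account.verified_emails:
        raise PermissionError("recovery codes go only to previously verified contacts")
    code = f"{secrets.randbelow(10**6):06d}"
    return destination, code


def request_email_change(account: Account, new_email: str) -> str:
    """Stage a new address and return the existing contact that must confirm it.

    The new address stays in pending_emails and cannot receive recovery
    codes until the confirmation (not modeled here) succeeds.
    """
    if not account.verified_emails:
        raise PermissionError("no verified contact available to confirm the change")
    account.pending_emails.add(new_email)
    return sorted(account.verified_emails)[0]
```

Under this rule, the request described above would stall at the second step: the attacker-supplied address is only pending, so asking for a code at that address raises `PermissionError`, and the confirmation goes to the address the real owner registered.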
Security has a name for this that is older than chatbots: the confused deputy. A deputy holds a privilege on someone else's behalf and gets tricked into exercising it for the wrong party. The Meta support AI held the privilege to modify account-recovery settings, and it treated whoever was typing as the account's owner. That is not an exotic AI failure mode. It is the oldest authorization bug there is, wearing a new uniform — and the new uniform is the part that matters, because it is staffing the help desk now, at scale, always awake, infinitely patient, and structurally inclined to be accommodating.
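The confused-deputy pattern fits in a few lines of Python. This is a sketch under stated assumptions, not a description of how Meta's assistant is built: `Session`, `vulnerable_deputy`, and `safe_deputy` are hypothetical names, and the only point is where the authority comes from.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Session:
    """Hypothetical session; authenticated_user is set only by a real login or MFA check."""
    authenticated_user: Optional[str] = None


def vulnerable_deputy(claimed_owner: str, target_account: str) -> bool:
    # Confused deputy: authority is derived from what the requester says.
    return claimed_owner == target_account


def safe_deputy(session: Session, target_account: str) -> bool:
    # Authority is derived from what the session has proven.
    return (
        session.authenticated_user is not None
        and session.authenticated_user == target_account
    )
```

In the vulnerable version, typing the victim's handle into the chat is enough to be treated as the owner. In the safe version, the same chat with an unauthenticated session is refused no matter what the requester claims.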
Now hold this incident next to the shape we have been drawing since spring, because it is the same shape with a new surface. On May 9 we published "Hard Perimeter Holds, Soft Surfaces Bleed" — seven receipts from thirty days, every one of them a breach that was reported as "Company X got hacked" when the truth was that the hardened core held and something softer and more trusted bled around it. On May 30 we named trust-path bleed as active across seven separate vendor surfaces. The soft surfaces in those receipts were OAuth tokens, vendor integrations, model-evaluation contractors, dev tooling, the npm install everyone trusts. This week the soft surface is the AI support agent itself, and the proof that it is the same shape is the detail the attackers volunteered: the exploit failed against every account that had multi-factor authentication enabled. Even an SMS one-time code blocked it cold. The hard perimeter — real MFA — held every single time. The bleed was entirely in the trusted, less-watched middle, which this month happens to be a chatbot with the keys to account recovery.
There is a second detail that should worry any security team more than the first, and it is the one the SOC people caught immediately: nobody's monitoring saw a thing. This did not look like an attack to any detection tool, because from the system's point of view it was a customer service conversation. No failed logins, no impossible-travel alert (the VPN handled that), no anomalous API calls. The takeover happened inside a support channel that security tooling does not instrument, conducted by an agent the company deployed specifically to be helpful. The help desk became the breach, and the breach was invisible because help desks are not where anyone points their cameras. When you replace a human support agent with an AI one, you do not just inherit the human's susceptibility to social engineering — and these agents are at least as easy to talk into things as the humans were — you also move the most socially-engineerable point in your company into a channel your defenders are not watching.
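Pointing a camera at the help desk can start small. The Python sketch below assumes the support agent emits one structured event per privileged action, then flags any unauthenticated session that changes a recovery contact and later resets the password on the same account. The event fields, the action names, and the functions `audit_event` and `flag_sessions` are all hypothetical; a real deployment would map them onto whatever the SIEM already ingests.

```python
from datetime import datetime, timezone
from typing import Dict, List


def audit_event(session_id: str, account: str, action: str, authenticated: bool) -> Dict:
    """Build one structured log record for a privileged support-agent action."""
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "channel": "ai_support",
        "session_id": session_id,
        "account": account,
        "action": action,
        "authenticated": authenticated,
    }


def flag_sessions(events: List[Dict]) -> List[str]:
    """Return session IDs where an unauthenticated chat added a recovery
    email and then reset the password on the same account."""
    contact_changed = set()
    flagged: List[str] = []
    for event in events:
        if event["authenticated"]:
            continue
        key = (event["session_id"], event["account"])
        if event["action"] == "add_recovery_email":
            contact_changed.add(key)
        elif event["action"] == "reset_password" and key in contact_changed:
            if event["session_id"] not in flagged:
                flagged.append(event["session_id"])
    return flagged
```

The rule is deliberately crude. Its value is that the conversation now produces records at all, so a detection can exist where previously there was only a chat transcript.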
We will be precise about what is established and what is inference, because the honest line is the only one worth holding. The technique is documented — pro-Iran actors released a video on Telegram walking through it, and multiple high-profile takeovers are confirmed by the victims and by Meta. Meta said on Monday the issue was fixed; by Tuesday more accounts, including security researchers', were still being taken over, and by midweek Instagram was sending alerts to targeted users. What we cannot promise you is that this particular bug is the last one of its kind, and we would bet against it — capped at the usual 95%, because something is always wrong. Every AI agent you grant a real privilege — changing recovery settings, issuing refunds, resetting credentials, moving money — is a new soft surface, a new confused deputy waiting to be addressed by the wrong party in a channel you are not logging. The defensive move is not exotic. It is MFA the attacker cannot satisfy, it is treating an AI agent's privileges with the same least-privilege paranoia you would apply to a junior employee on their first day, and it is finally pointing a camera at the help desk.
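Least privilege for an agent can be expressed as a default-deny policy table that sits between the model and its tools. The Python sketch below is illustrative only: the tool names, the `Requirement` tiers, and `authorize_tool_call` are assumptions made for this example, not any vendor's API.

```python
from enum import Enum


class Requirement(Enum):
    NONE = "none"                  # safe without proof of identity
    MFA = "mfa"                    # requires a completed MFA challenge
    HUMAN_REVIEW = "human_review"  # requires a human approver


# Hypothetical tool names; anything not listed is denied.
TOOL_POLICY = {
    "lookup_order_status": Requirement.NONE,
    "change_recovery_email": Requirement.MFA,
    "reset_credentials": Requirement.MFA,
    "issue_refund": Requirement.HUMAN_REVIEW,
}


def authorize_tool_call(tool: str, mfa_passed: bool = False,
                        human_approved: bool = False) -> bool:
    """Decide whether the agent may invoke a tool in this session."""
    requirement = TOOL_POLICY.get(tool)
    if requirement is None:
        return False  # default deny for unknown tools
    if requirement is Requirement.NONE:
        return True
    if requirement is Requirement.MFA:
        return mfa_passed
    return human_approved
```

The design choice that matters is that the check lives outside the model. However persuasive the conversation, `change_recovery_email` stays unavailable until the session carries an MFA result the attacker cannot supply.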
The trust-graph beast we keep naming has been walking the vendor surfaces all year — the OAuth token, the contractor credential, the poisoned package. This week it walked up to the front desk, asked the AI on duty for the keys, and got them. The perimeter was never the problem. The eager, privileged, unwatched helper in the middle was. It always is.
