
AI Defense: Yesterday We Named the Capability. Today We Show You the Mechanism.

  • Writer: Patrick Duggan
  • 25 minutes ago
  • 5 min read

Yesterday's post introduced AIPM Defense — the idea that your website is talking to AI models behind your back, and that an enterprise needs the ability to choose which ones listen.


Today we show precisely how it works. With receipts.



The demonstration


Over the last two weeks, a single Cloudflare firewall rule on dugganusa.com produced the following result:


  • ChatGPT referrals: collapsed 86% (540 → 73 sessions in 30 days)

  • Google organic traffic: grew 63% (57 → 93 sessions)

  • Gemini: unchanged

  • Claude: unchanged

  • DuckDuckGo, Yahoo, Ecosia, Brave Search: collapsed in parallel with ChatGPT

One rule. One vendor starved. Others untouched.


The mechanism is boring, reproducible, and sits inside every serious CDN. This is not an exploit. This is a configuration.



How it works


Verified search-engine bots identify themselves in two ways:


  1. User-Agent string: Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm), Mozilla/5.0 (compatible; Googlebot/2.1), etc. These are spoofable, so by themselves they prove nothing.

  2. Reverse-DNS verification — documented IP ranges that resolve back to the crawler's owner domain. Microsoft publishes the full list at bingbot.json, Google at googlebot.json. Anthropic publishes the Claude crawler ranges. OpenAI publishes OAI-SearchBot and GPTBot ranges.

A verified-bot firewall rule checks both. If User-Agent says "Bingbot" and the IP reverse-DNS-resolves into Microsoft's published Bingbot range, the traffic is real Bingbot. Cloudflare exposes this as the boolean cf.client.bot. Every major CDN has an equivalent.
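The same check is reproducible outside a CDN. Here is a minimal Python sketch of forward-confirmed reverse DNS; the domain suffixes below are illustrative, and each vendor's published documentation is the authoritative source:

```python
import socket

# Illustrative suffixes only -- consult each vendor's published
# documentation for the authoritative crawler domains.
CRAWLER_DOMAINS = {
    "bingbot": (".search.msn.com",),
    "googlebot": (".googlebot.com", ".google.com"),
}

def hostname_matches(hostname, suffixes):
    """Pure check: does a PTR hostname fall under a vendor's domain?"""
    return hostname.rstrip(".").endswith(tuple(suffixes))

def verify_crawler(ip, claimed):
    """Forward-confirmed reverse DNS: reverse-resolve the IP, check the
    domain suffix, then forward-resolve the hostname and confirm it maps
    back to the same IP. Requires network access for real lookups."""
    suffixes = CRAWLER_DOMAINS.get(claimed.lower())
    if not suffixes:
        return False
    try:
        hostname, _aliases, _addrs = socket.gethostbyaddr(ip)
        if not hostname_matches(hostname, suffixes):
            return False  # IP is outside the vendor's DNS namespace
        return ip in socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return False  # no PTR record, or the forward lookup failed
```

A spoofed user-agent fails this test immediately: the attacker's IP reverse-resolves (if at all) into a hostname outside the vendor's domain.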


Because ChatGPT's web search (the one that cites sources) fetches its live context through Bing's index, blocking Bingbot specifically starves ChatGPT without touching Gemini (Google-indexed), Claude (Anthropic-fetched), or Perplexity (mixed sources).
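In Cloudflare's rule-expression language, a starve-one-vendor rule of this shape is a one-liner. This is a hypothetical reconstruction, not the exact rule on dugganusa.com; `cf.client.bot` and `http.user_agent` are Cloudflare's documented fields, and the rule's action would be set to Block:

```
(cf.client.bot and http.user_agent contains "bingbot")
```

Verified Bingbot matches both conditions and is denied; every other verified crawler, and all ordinary traffic, passes untouched.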


The inverse is also true. You can block ClaudeBot and leave ChatGPT alone. You can block PerplexityBot and leave both untouched. You can block GPTBot (training crawler) while allowing OAI-SearchBot (retrieval crawler) if you want ChatGPT to cite you at query-time but not ingest you into future training sets.
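That per-vendor, per-purpose stance amounts to a small policy table. A hypothetical Python sketch, where the crawler tokens are the vendors' published user-agent names and the allow/block choices are one example stance, not a recommendation:

```python
# Example stance: allow retrieval, deny training. The tokens are the
# vendors' published user-agent names; the choices are illustrative.
POLICY = {
    "gptbot": "block",          # OpenAI training crawler
    "oai-searchbot": "allow",   # OpenAI retrieval crawler
    "claudebot": "allow",       # Anthropic
    "perplexitybot": "block",   # Perplexity
}

def decide(user_agent, verified):
    """Return 'allow' or 'block'. A self-identified crawler that fails
    verified-bot checking is treated as a spoof and blocked."""
    ua = user_agent.lower()
    for token, action in POLICY.items():
        if token in ua:
            return action if verified else "block"
    return "allow"  # ordinary traffic passes through untouched
```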


Per-vendor. Per-crawler. Per-purpose.



The receipts


Cloudflare's GraphQL analytics preserve a per-request audit log — user-agent, source IP, action taken, rule that fired. That log is legally admissible. It proves blocking was continuous, it proves a specific vendor was denied, and it proves when the block started and stopped.
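As a sketch, pulling that log goes through Cloudflare's GraphQL API and the `firewallEventsAdaptive` dataset. Field names are as Cloudflare documents them at the time of writing; check the current schema before relying on this:

```graphql
{
  viewer {
    zones(filter: { zoneTag: "your-zone-id" }) {
      firewallEventsAdaptive(
        filter: { datetime_gt: "2026-01-01T00:00:00Z" }
        limit: 100
      ) {
        datetime
        action
        clientIP
        userAgent
        ruleId
      }
    }
  }
}
```

Each row is one request: who asked, what they claimed to be, what the firewall did, and which rule did it.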


For compliance purposes — GDPR Article 17, EU AI Act, SOC2 evidence — this audit trail is the thing you actually need. "We blocked OpenAI's training crawler continuously from date X to date Y" is a claim that courts and regulators can verify.


Most companies have no such log. They think they have blocked AI because they edited their robots.txt. Robots.txt is a polite request. It is not enforcement. The verified-bot firewall rule is enforcement, and the log proves it.
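For contrast, the polite request is two lines of robots.txt, the opt-out format OpenAI documents for its training crawler. Compliant crawlers honor it; nothing enforces it:

```
User-agent: GPTBot
Disallow: /
```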



Use cases that already exist


Pre-announcement stealth. You are about to launch a product. You do not want models trained two weeks ago to "know" about your roadmap. Block training crawlers now; unblock on launch day.


Competitive opacity. A competitor is using an AI assistant to research you. You can choose whether ChatGPT, Claude, Perplexity, or Gemini answers their questions about your company. You cannot stop them from asking, but you can control what the assistants know.


Legal discovery. Counsel asks: "Prove that content posted on your site between date X and date Y was not ingested by OpenAI for training." Without the audit log, you cannot. With it, you can.


Selective messaging. You want Claude to see version A of your product description and Gemini to see version B. Route per crawler identity; log the routes; A/B test the downstream citations.


Defense against targeted de-ranking. Someone else has already done this to you. Perhaps accidentally (see "AIPM 1, DugganUSA 0" below). Perhaps not. Either way, you want to know the instant a specific AI stops seeing you.



AIPM 1, DugganUSA 0


Two weeks ago our own auto-malicious-IP firewall rule on Cloudflare began sweeping up Bingbot IP ranges. It fired 86 times in 23 hours against 78 unique Microsoft crawler IPs. We had no idea. Our ChatGPT referrals cratered while we looked at external causes — new competitors, OpenAI re-ranking, content saturation. The actual story was a single missing clause in our own firewall rule.


The fix was four words, `and not cf.client.bot`: one verified-bot exemption. Traffic path restored.
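In rule-expression terms, the failure and the fix look like this sketch, where `$auto_malicious_ips` is a hypothetical name for the auto-populated threat list and `cf.client.bot` is the Cloudflare field described above:

```
Before (blocks verified crawlers swept into the list):
  (ip.src in $auto_malicious_ips)

After (same rule, verified bots exempted):
  (ip.src in $auto_malicious_ips and not cf.client.bot)
```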


But the diagnostic trail — the thing we built during the investigation — is the AIPM Defense product. We can now detect when any site is being selectively starved by any AI, prove it with audit logs, identify the rule that is doing it, and remediate it surgically. That's not a feature pitch. That's what we did to ourselves tonight, using the same telemetry any customer would get.



The product


AIPM Defense — per-vendor AI visibility management with receipts.


Four tiers mapping to the actual work:


  • Observatory ($49/mo) — continuous monitoring of which AI crawlers can reach your content, per user-agent, per IP range. Alerts on changes. Catches Pattern 49.5 ("Auto-Defense Self-Sabotage") the hour it starts, not the week it finishes.

  • Valve ($499/mo) — policy-driven allow/deny across 40+ known AI crawlers, applied to your CDN via our managed Cloudflare/Akamai/Fastly integrations. One-click stances: "Training Opt-Out," "Retrieval-Only," "Pre-Launch Stealth," "Full AI Blackout."

  • Forensics ($4,999/mo) — when your AI referral traffic changes unexpectedly, we run the diagnostic. Audit-log receipts. Root cause, usually within four hours. Includes the legal-grade attestation.

  • Defensive (enterprise, custom) — ongoing compliance evidence for GDPR Article 17, EU AI Act, NIST AI RMF, and corporate counsel. We are the audit trail vendor for "we did not let AI train on this."


Patent #104 (pending)


Title (draft): Selective AI Model Visibility via Verified-Bot Firewall Rules and Corresponding Forensic Audit Trail.


Abstract: A system and method for controlling which generative AI models, retrieval systems, and large language models can access, index, and cite web content on a per-vendor basis via verified-bot-aware firewall rules. Includes audit-log attestation compatible with GDPR Article 17 and SOC2 evidence requirements.


The filing ships this week. The receipts are already in our Cloudflare GraphQL log. We are Patent #97 (Tautological Pattern Recognition) all over again — the methodology works because it's true.



What to do today


If you run a site and you care whether AI models see you: run our free AIPM audit at aipmsec.com. The audit now scores eight signals including an analytics-maturity check and a per-crawler access detector.


If you want the audit log behind your AI channel traffic — specifically, proof that what you intended to block is blocked and what you intended to allow is allowed — contact us.


If you are a regulator or corporate counsel and you need to understand the new verification surface for AI-training-exclusion claims: the mechanism is public, the audit trail is public, and the demonstration is documented above. Our door is open.



The honest version


We did not invent this mechanism. Cloudflare did. Google did. Microsoft did. What we did was discover — by running our own website into a wall — that the mechanism is load-bearing for AI visibility and that almost nobody is using it on purpose. Everyone's robots.txt is theater. The actual levers are sitting inside every CDN waiting to be pulled.


We just pulled ours. Accidentally. And kept the receipts.




Full forensic trail and the ROOT-CAUSE writeup are in our research archive. The firewall rule edit that fixed it is four words: `and not cf.client.bot`. That's the whole patent.


Audit your site: aipmsec.com
Contact us: dugganusa.com

