Your 403 Logs Are a Customer List and a Threat Roster. Here's How to Read Them.

Patrick Duggan
Apr 12
Updated: Apr 25

Every authenticated API endpoint has a reject pile: requests that came in without a key, with a bad key, or with a key on the wrong tier. The standard practice is to return a 403, log the event, and move on.

We stopped moving on. We started reading them. What we found in a single weekend changed how we think about our own traffic.

This post describes a technique. It is not complicated. Any organization running an authenticated API can do it tonight with the logs they already have. Nobody else appears to be publishing this as a formal methodology, so we are.

The technique

Group your 403 (and 401) responses by three dimensions: IP hash (or raw IP if you have it), user agent, and target path. Sort by frequency. Look for patterns.
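A minimal sketch of that grouping, assuming the logs have already been parsed into dictionaries. The field names (`status`, `ip_hash`, `user_agent`, `path`) and the sample paths are illustrative assumptions, not a real log schema.

```python
from collections import Counter

# Hypothetical parsed log records; adapt the keys to your own log format.
SAMPLE_RECORDS = [
    {"status": 403, "ip_hash": "a1", "user_agent": "python-requests/2.32.5", "path": "/api/search"},
    {"status": 403, "ip_hash": "a1", "user_agent": "python-requests/2.32.5", "path": "/api/search"},
    {"status": 401, "ip_hash": "c3", "user_agent": "okhttp/3.14.9", "path": "/api/feed"},
    {"status": 200, "ip_hash": "a1", "user_agent": "python-requests/2.32.5", "path": "/api/search"},
]


def group_rejects(records):
    """Count 401/403 responses by (ip_hash, user_agent, path), most frequent first."""
    counts = Counter(
        (r["ip_hash"], r["user_agent"], r["path"])
        for r in records
        if r["status"] in (401, 403)
    )
    return counts.most_common()


if __name__ == "__main__":
    for (ip_hash, user_agent, path), hits in group_rejects(SAMPLE_RECORDS):
        print(hits, ip_hash, user_agent, path)
```

The sorted output is the raw material for everything that follows: each row is one client, one tool, and one target.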

What you are looking for:

One. Single-path obsession. An IP that hits the same path hundreds or thousands of times without trying any other endpoint is not a user who forgot their password. It is either an automated collection tool or a reconnaissance probe. The path they chose tells you what they want.

Two. User agent fingerprints. The exact version string of an HTTP library is a fingerprint. If seven IPs across four countries are all running python-requests/2.32.5 against your search API, that is not seven independent researchers. That is one tool deployed in multiple locations. The version number is the tell.

Three. Country convergence. Multiple distinct IPs from the same country hitting the same auth-gated endpoint within a short time window (we use 72 hours) is a signal. One actor is a curiosity. Two is a coincidence. Three is a campaign. We built an automated precursor detection signal around this pattern after discovering it empirically.

Four. User agent spoofing. An IP presenting an iPhone user agent string (iOS 14.4) alongside an okhttp/3.14.9 header is lying. okhttp is an HTTP client library for Android and Java. It does not appear in any legitimate iOS browser. The lie tells you the operator is aware of fingerprinting and is actively trying to evade it, which is itself a classification.

Five. Behavioral response to disclosure. If you publish an investigation naming an actor and they stop polling within minutes, that is not a coincidence. That is an operator reading your API response body in real time. Innocent scripts do not respond to content changes in the response payload.
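Three of the patterns above (single-path obsession, shared user agent fingerprints, and user agent spoofing) reduce to short checks over the rejected requests. The sketch below assumes the records are already filtered to 401/403 responses and uses the same hypothetical field names (`ip_hash`, `user_agent`, `path`). The thresholds are assumptions to tune, and the spoofing check assumes you can collect the client-identifying header values for a request into a list.

```python
from collections import defaultdict


def single_path_ips(records, min_hits=100):
    """Map each IP rejected at least min_hits times on exactly one path to that path."""
    paths = defaultdict(set)
    hits = defaultdict(int)
    for r in records:
        paths[r["ip_hash"]].add(r["path"])
        hits[r["ip_hash"]] += 1
    return {
        ip: next(iter(seen))
        for ip, seen in paths.items()
        if len(seen) == 1 and hits[ip] >= min_hits
    }


def shared_agent_clusters(records, min_ips=3):
    """Map each exact user agent string seen from at least min_ips distinct IPs to those IPs."""
    ips = defaultdict(set)
    for r in records:
        ips[r["user_agent"]].add(r["ip_hash"])
    return {ua: sorted(seen) for ua, seen in ips.items() if len(seen) >= min_ips}


def mixes_iphone_and_okhttp(header_values):
    """True when one request claims to be an iPhone and also identifies as okhttp."""
    text = " ".join(header_values).lower()
    return "iphone" in text and "okhttp/" in text
```

None of these checks is a verdict on its own. A shared library version can be a distributed tool or simply a popular release, so treat each hit as a prompt for a closer look.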

What we found

We applied this technique to our own STIX/TAXII threat intelligence feed logs over a weekend. We identified seven distinct actor models operating against our authenticated endpoints. Five of them are described below.

One was a 65-day persistent poller from AT&T Wireless infrastructure near Kennedy Space Center, using an axios-based script with a collection name matching a GitHub username belonging to an Alibaba Group developer in Beijing. That actor went dark within minutes of us putting a blog post link in the 410 response. We published the investigation. We filed with Rewards for Justice.

One was a Chinese actor with a deliberately stripped user agent harvesting our IOC CSV exports 362 times in 24 hours. A different Chinese IP had hit the same endpoint two days earlier — different hash, different user agent, same target. That is IP rotation by a single operator.

One was a Hong Kong cloud provider — UCloud, AS135377 — with two IPs spoofing iPhone user agents while probing our API structure. GreyNoise's 2026 State of the Edge report documented this ASN as responsible for 392 million malicious sessions (14% of all observed malicious traffic globally) with 38% of its IPs classified as malicious.

One was a French Kubernetes scanning cluster on a brand-new ASN (AS211590, registered March 2025) using curl to enumerate .env and credential file paths across our API surface. The ASN lists an anonymous Tutanota address as its abuse contact and shows bulletproof hosting indicators.

One was a cluster of seven IPs across the United States, Canada, United Kingdom, and Australia — all running python-requests/2.32.5 against our search endpoint. The identical library version across four Five Eyes countries is the fingerprint of a distributed tool. Further analysis identified the pattern as consistent with OpenCTI TAXII connector workers — legitimate cyber threat intelligence platforms consuming our feed through their default polling configuration. The 60-minute polling interval on the Australian node matched OpenCTI's default TAXII connector setting exactly.

That last one is the point of this post. The same technique that identified threat actors also identified legitimate CTI practitioners who had not registered for an API key. Those are not threats. Those are customers who do not know they are customers yet.

The dual read

Your 403 logs contain two populations that look identical at the HTTP level: people trying to steal your data and people trying to use your data. The technique described above separates them.

Threat actors spoof user agents, strip identifying headers, rotate IPs, and probe paths they should not know about. CTI practitioners use standard library versions, poll at regular intervals, target documented endpoints, and operate from cloud infrastructure in jurisdictions consistent with their organizational mandate.

The 403 is the same. The intent is opposite. The fingerprints distinguish them.
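One of the practitioner traits listed above, polling at regular intervals, can be measured directly from request timestamps. The sketch below is a rough heuristic under stated assumptions: timestamps are epoch seconds for a single client, and the 5% tolerance is an arbitrary starting point.

```python
from statistics import median


def polling_profile(timestamps, tolerance=0.05):
    """Return (median_gap_seconds, is_regular) for one client's request times."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 3:
        # Too few requests to say anything about cadence.
        return None, False
    mid = median(gaps)
    if mid <= 0:
        return mid, False
    # Regular means every gap stays within the tolerance band around the median.
    regular = all(abs(gap - mid) <= tolerance * mid for gap in gaps)
    return mid, regular
```

A median gap near 3600 seconds with low jitter would match the 60-minute cadence described earlier. Regularity is one input to the read, not proof of good intent, since a persistent hostile poller can be just as punctual.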

What to do with it

For the threat actors: block, document, publish, report. We blocked the ASNs. We documented the behavioral profiles. We published the investigations. We filed with the appropriate authorities. The investigations are ongoing.

For the CTI practitioners: reach out. They are already consuming your product — they just have not registered. A 403 that includes a registration URL is a sales pitch that costs nothing. We added registration links to our error responses. Several of the OpenCTI nodes have since registered for free API keys.
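A framework-agnostic sketch of a 403 that carries a registration link. The URL, the JSON keys, and the function name are hypothetical placeholders, not the response format we actually serve.

```python
import json

# Hypothetical placeholder; substitute your real sign-up page.
REGISTRATION_URL = "https://example.com/register"


def forbidden_response(reason):
    """Build a (status, headers, body) triple for a rejected request."""
    body = json.dumps(
        {
            "error": "forbidden",
            "reason": reason,
            "register": REGISTRATION_URL,
        }
    )
    headers = {"Content-Type": "application/json"}
    return 403, headers, body
```

Most web frameworks can return this triple with little adaptation. The point is that an automated client's operator will eventually read the body, and the link is there when they do.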

For the technique itself: automate it. We built a precursor detection signal — Signal Number 6, internally called "Princess and the Pea" — that fires automatically when multiple distinct IPs from the same country converge on auth-gated threat intelligence endpoints within a 72-hour window. The signal that took us a weekend to discover manually now generates an email alert within minutes.
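The convergence rule can be sketched as a sliding window over rejected requests. This is an illustration of the idea, not our production signal: the event tuple layout is an assumption, and the three-IP threshold is taken from the "three is a campaign" rule of thumb above.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 72 * 3600


def converging_countries(events, min_ips=3, window=WINDOW_SECONDS):
    """Flag countries where min_ips distinct IPs hit gated endpoints within one window.

    events: iterable of (epoch_seconds, country, ip_hash) for rejected requests.
    Returns {country: sorted IPs present in the first window that met the threshold}.
    """
    by_country = defaultdict(list)
    for ts, country, ip in events:
        by_country[country].append((ts, ip))

    flagged = {}
    for country, hits in by_country.items():
        hits.sort()
        in_window = Counter()
        start = 0
        for ts, ip in hits:
            in_window[ip] += 1
            # Evict requests that fell out of the trailing window.
            while hits[start][0] < ts - window:
                old_ip = hits[start][1]
                in_window[old_ip] -= 1
                if in_window[old_ip] == 0:
                    del in_window[old_ip]
                start += 1
            if len(in_window) >= min_ips:
                flagged[country] = sorted(in_window)
                break
    return flagged
```

Running this on a schedule over recent 401/403 events and alerting on any non-empty result gives a basic version of the alert described above.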

The meta-point

Threat intelligence is not just the data you publish. It is the data your infrastructure generates by existing. Every 403 is a question someone asked. The question itself is intelligence — about your adversaries, your market, and your customers.

Read your reject pile. It is trying to tell you something.

— Patrick

Read the Spylandia investigation: dugganusa.com/post/one-ip-one-script-100-000-requests-who-is-polling-our-stix-feed-from-the-space-coast

Her name was Renee Nicole Good.

His name was Alex Jeffery Pretti.
