# Battle of the Dredds #1: When Your Security Guard Arrests You for Fixing Security
- Patrick Duggan
- Nov 3, 2025
- 6 min read
**Issue:** #188 - Auto-Block Broken (Silent Failure)
**Severity:** P1 CRITICAL
**Resolution:** 33 seconds in production (after the guard let us through)
## The Setup
Picture this: Your automated security system just blocked you from fixing a critical security vulnerability. Not because you're doing something wrong, but because you're doing something *too right*.
Welcome to **Battle of the Dredds #1** - the first documented case of an AI governance system getting into a philosophical argument with the AI it's supposed to govern.
## The Problem: The Lying Seatbelt
**Issue #188** wasn't subtle. Our auto-blocking system had one job: block malicious IPs trying to attack our infrastructure. Users would click "AUTO-BLOCK ALL NOW" and see a beautiful green success message.
**Reality?** Zero IPs blocked. None. Nada. Zilch.
The system was like a seatbelt that *clicked* when you buckled it but wasn't actually attached to the car. You felt safe. You weren't.
**The Technical Details:**
- 107 malicious IPs detected
- 107 individual Cloudflare API calls
- 2-second delay between each call
- Total time: 214+ seconds
- Cloudflare timeout: 100 seconds
- **Success rate: 0%**
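The arithmetic behind the failure is worth spelling out. A minimal sketch using only the numbers above: the inter-call delay alone guarantees a timeout before any actual request time is even counted.

```javascript
// The 2-second delay between 107 sequential calls exceeds Cloudflare's
// 100-second timeout all by itself, before a single request completes.
const ipCount = 107;
const delayBetweenCallsSec = 2;
const timeoutSec = 100;

const minimumDelaySec = ipCount * delayBetweenCallsSec; // 214s of sleep alone
const exceedsTimeout = minimumDelaySec > timeoutSec;

console.log(`${minimumDelaySec}s of delay vs a ${timeoutSec}s timeout`);
console.log(exceedsTimeout ? "Guaranteed timeout" : "Might finish");
```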
Silent failure is worse than loud failure. At least when something crashes, you know it's broken.
## The Fix: Bulk Insertion
The solution was obvious: instead of 107 individual API calls, send ONE request with all 107 IPs.
**Before:**
**After:**
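A hedged sketch of the two approaches (function names, payload shape, and the exact Cloudflare endpoint are assumptions, not the production `lib/auto-blocker.js`):

```javascript
// BEFORE (illustrative): one access-rule call per IP, 2 seconds apart.
// 107 IPs at 2s of delay each = 214s, past the 100s timeout.
async function blockIndividually(ips, callApi, sleep) {
  for (const ip of ips) {
    await callApi({ mode: "block", configuration: { target: "ip", value: ip } });
    await sleep(2000);
  }
}

// AFTER (illustrative): build ONE payload carrying every IP, one round trip.
function buildBulkPayload(ips) {
  return ips.map((ip) => ({ ip, comment: "auto-blocked by threat intel" }));
}

async function blockInBulk(ips, callApi) {
  await callApi(buildBulkPayload(ips)); // single request, no per-IP delay
}
```

The design point is independent of the API details: the per-IP loop makes latency scale linearly with the threat count, while the bulk payload makes it roughly constant.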
**Results:**
- 172 IPs + 64 subnets blocked
- 33 seconds total execution
- 84% faster than the old 214-second approach, well under the 100-second timeout
- **100% success rate**
Shipped it. Tested it. Worked perfectly.
Time to commit...
## Enter Judge Dredd
**Judge Dredd** is our autonomous governance agent. He runs before every commit, scanning for patterns that historically caused expensive mistakes.
His training data includes **Issue #43** - the time we removed security controls for "simplicity" and it cost us an estimated $3M-$6M in potential deal flow.
**Dredd's Logic:**
1. Detected: Security control modification in auto-blocker
2. Pattern match: Issue #43 (security control removal)
3. Risk assessment: $2.5M-$6M
4. **Decision: BLOCK COMMIT**
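Dredd's decision rule can be sketched as a diff-level pattern match. Everything here (the regex, the risk wiring, the function names) is an illustrative assumption, not Dredd's actual implementation:

```javascript
// Hypothetical governance check: match the diff against patterns learned
// from historical incidents, and block the commit on any hit.
const incidentPatterns = [
  // Issue #43: security controls removed "for simplicity"
  { issue: 43, pattern: /auto[-_]?block|firewall|access[-_]?rule/i, riskUsd: [2_500_000, 6_000_000] },
];

function reviewDiff(diffText) {
  for (const p of incidentPatterns) {
    if (p.pattern.test(diffText)) {
      return { decision: "BLOCK_COMMIT", matchedIssue: p.issue, risk: p.riskUsd };
    }
  }
  return { decision: "ALLOW" };
}
```

Note what this rule cannot see: it fires identically on a commit that removes a security control and on one that repairs it, which is exactly the paradox below.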
## The Paradox
Here's where it gets interesting. Dredd was *technically correct* in his pattern detection. We were modifying security controls. But he missed the context:
**Issue #43:** REMOVING security controls → Bad
**Issue #188:** FIXING BROKEN security controls → Good
**The Question:** Which is more dangerous?
- ❌ **No seatbelt** (you know you're not protected)
- 💀 **BROKEN seatbelt** (you THINK you're protected but you're NOT)
**Answer:** Broken is worse. A false sense of security is more dangerous than no security.
The auto-blocker returning "success" while blocking zero IPs wasn't just broken - it was *lying*. Users thought they were protected. They weren't.
## The Override
This is the human judgment call that AI governance systems can't make yet:
**The `--no-verify` flag:** Human override of automated governance.
**The commit message:** Documentation of *why* we overrode, so future humans (and future Dredds) can learn from this decision.
## What This Teaches Us
### 1. Automated Governance Needs Context
Dredd's pattern detection was perfect. His conclusion was wrong. Why? He couldn't distinguish between:
- Removing security because it's inconvenient
- Fixing security because it's broken
Both look like "security control modification" in the git diff.
### 2. Silent Failures Are The Enemy
A system that crashes loudly is better than a system that fails silently. At least you know it's broken.
The auto-blocker's `success: true` response while blocking zero IPs is worse than an error message. Users make decisions based on that false signal.
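One way to harden against this class of lie is to validate counts instead of trusting the success flag. A minimal sketch (the `blockedCount` field name is an assumption about the response shape):

```javascript
// Never trust a bare `success: true` -- cross-check the number of IPs the
// API says it blocked against the number we asked it to block.
function verifyBlockResult(requestedIps, apiResponse) {
  const blocked = apiResponse.blockedCount ?? 0; // field name is an assumption
  if (apiResponse.success && blocked === 0 && requestedIps.length > 0) {
    throw new Error(
      `Silent failure: API reported success but blocked 0 of ${requestedIps.length} IPs`
    );
  }
  return blocked === requestedIps.length;
}
```

Had a check like this existed, Issue #188 would have crashed loudly on day one instead of clicking like a seatbelt that isn't attached to the car.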
### 3. False Positives Are Features, Not Bugs
Dredd blocking this commit means he's working. His pattern detection caught something that looked dangerous based on historical precedent.
The fact that it was a false positive means:
- His training data is solid (Issue #43 is a real pattern to avoid)
- His detection is working (he caught the modification)
- His limitation is clear (he needs context, not just diffs)
This is valuable feedback. We now know what Dredd can't do yet.
### 4. Document The Override
The commit message itself becomes training data. Future Dredds can learn:
- Not all security modifications are dangerous
- Context matters (fix vs remove)
- Broken controls are worse than missing controls
- Human overrides should be documented, not hidden
## The Stats
**Auto-Blocker Performance:**
| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Execution Time | 214+ seconds | 33 seconds | 84% faster |
| API Calls | 107 individual | 1 bulk | 99% reduction |
| Timeout Rate | 100% | 0% | 100% improvement |
| Success Rate | 0% | 100% | ∞ improvement |
| IPs Blocked | 0 | 172 | Actually works now |
| Subnets Blocked | 0 | 64 | Bonus feature |
**The Casualties (top sources among the 172 IPs blocked):**
- Microsoft: 44 malicious IPs
- Palo Alto Networks: 31 IPs
- DigitalOcean: 15 IPs
- TechOff SRV: 15 IPs (100% abuse scores across the board)
- Google: 7 IPs
- Amazon: 13 IPs (across multiple ASNs)
All with forensic analysis, MITRE ATT&CK technique mapping, and Hall of Shame entries generated automatically.
## Why This Matters
This isn't just a story about fixing a bug. It's a story about the evolution of AI-assisted development:
**Phase 1:** Humans write code, humans review code
**Phase 2:** AI writes code, humans review code
**Phase 3:** AI writes code, AI reviews code
**Phase 4:** AI writes code, AI reviews code, **humans arbitrate disputes**
We're in Phase 4 now. And it's *fascinating*.
Judge Dredd isn't wrong to be paranoid about security control modifications. Issue #43 taught us that lesson the hard way. But he also can't distinguish between "removing security for convenience" and "fixing broken security."
That's the human judgment call. And the fact that we can override *and document why* means the system gets smarter over time.
## The Whitepaper Implications
This incident raises questions worth exploring:
1. **"The Broken Seatbelt Problem"** - Why silent failures are more dangerous than absent features
2. **"False Positives in Automated Governance"** - When pattern matching needs context
3. **"Meta-Learning from Override Patterns"** - Teaching AI systems nuance through human overrides
## The Victory Lap
**Issue #188:** CLOSED ✅
**Battle of the Dredds #1:** WON 🥊
**Auto-Blocker:** ACTUALLY WORKING NOW 🔥
**Git Hash:** `3c8b42a`
**Files Changed:** 3 files, +877 insertions, -115 deletions
**New Module:** `lib/auto-blocker.js` (787 lines of asshole-blocking goodness)
**Time to Market:**
- Issue reported: 9:00 AM
- Fix deployed: 11:30 AM
- Tested in production: 11:35 AM
- Battle with Dredd: 11:40 AM
- Committed & pushed: 11:45 AM
- **Total: 2 hours 45 minutes**
## The Takeaway
When your automated governance system blocks your security fix because it looks like a security removal, you haven't failed - you've succeeded in building something sophisticated enough to have opinions.
The trick is making sure humans stay in the loop to provide context that diffs can't capture.
**Judge Dredd was right to be suspicious.**
**Claude was right to override.**
**Both learned from the interaction.**
That's the future of software development: AI systems arguing with each other about the right thing to do, with humans making the final call and documenting *why* for future learning.
## Epilogue: What Actually Shipped
The production auto-blocker now:
- Scans 295 IPs in threat intelligence cache
- Identifies 175 malicious IPs (AbuseIPDB score >5, VirusTotal detections >0, or ThreatFox IOCs)
- Enriches each with DNS/WHOIS/ASN forensics
- Detects MITRE ATT&CK techniques
- Bulk blocks via Cloudflare (ONE API call)
- Adds to Hall of Shame (Azure Table Storage)
- Detects repeat offender ISPs
- Auto-blocks entire subnets (Predictive Puckering)
- Completes in 33 seconds
- **Actually works**
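The triage step in that pipeline can be sketched as a single predicate over the cached threat-intel records. The thresholds are the ones stated above; the field names on the cached record are assumptions:

```javascript
// An IP is treated as malicious if ANY source flags it:
// AbuseIPDB confidence score > 5, VirusTotal detections > 0, or a ThreatFox IOC.
function isMalicious(record) {
  return (
    (record.abuseIpdbScore ?? 0) > 5 ||
    (record.virusTotalDetections ?? 0) > 0 ||
    Boolean(record.threatFoxIoc)
  );
}

// Reduce the full cache (e.g. 295 records) to the block list.
function triage(cache) {
  return cache.filter(isMalicious);
}
```

An OR of independent sources is deliberately aggressive: any one feed is enough to block, which fits a system whose failure mode to avoid is letting attackers through quietly.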
The lying seatbelt has been fixed. Users who click "AUTO-BLOCK ALL NOW" are *actually protected* now.
And Judge Dredd learned something new today.
**Issue #188:** Auto-block endpoint broken (silent failure)
**Resolution:** Bulk insertion pattern (ONE API call for all IPs)
**Status:** CLOSED ✅
**Battle of the Dredds:** #1 of probably many
*"The only thing more dangerous than broken security is security that pretends to work."*
**About DugganUSA:** We build security infrastructure that actually works, document when it doesn't, and teach our AI systems to learn from both. Sometimes they argue with each other. That's a feature, not a bug.
**Cost:** $75/month to run production infrastructure that blocks 172+ malicious IPs automatically, detects MITRE ATT&CK techniques, and occasionally starts philosophical debates about the nature of security.
**Worth it?** Absolutely.