The Online Consensus Is Wrong

Patrick Duggan
Feb 16
Updated: Apr 25

title: "Why Opus 4.6 Isn't Broken — You Just Forgot to Tell It Who You Are"

date: 2026-02-16

author: Patrick Duggan

tags: [claude-code, opus-4.6, institutional-memory, hooks, agentic-ai]

category: opinions

# The Online Consensus Is Wrong

Opus 4.6 dropped February 5th. Within 48 hours, the internet split into two camps: "lobotomized" and "best model ever." Reddit threads titled "Opus 4.6 lobotomized." Hacker News debating whether Anthropic shipped a regression. Zvi writing about how it "escalates things quickly." One user said it felt "like a completely different and less agentic model." Another said they stopped using it because "it's noticeably less happy."

Meanwhile, at DugganUSA, we deployed a pricing fix across two production services — from issue identification to code change to Docker build to Azure deploy — in a single conversation. No hand-holding. No hesitation. No "I'd be happy to help you think about potentially considering..."

Same model. Completely different experience. Here's why.

# You're Running Opus Naked

If you open Claude Code with no `CLAUDE.md`, no hooks, no skills, no session protocol — you're getting the factory default. The factory default has to be safe for everyone, which means it's cautious about everything. It doesn't know who you are. It doesn't know what you've built. It doesn't know what's safe to touch and what requires confirmation.

So it asks. About everything. And that feels "lobotomized" because you're watching a capable model constantly pause to ask permission for things you'd obviously want it to do.

The fix isn't a better model. The fix is institutional memory.

# What Institutional Memory Actually Looks Like

We run a two-person company. Two production services. A STIX threat intelligence feed consumed by 275+ organizations in 46 countries. 230,000+ indicators. Monthly infrastructure cost: $76.

Here's what our Claude Code sessions look like:

**1. It knows who we are.**

Every session starts by loading identity context. Not "you are a helpful assistant" — actual operational identity. Company context. Partnership dynamics. What we've built. What we're building. After a session rollover (when context compresses), it reloads that identity before doing anything else.

We learned this the hard way. Issue #113: skipping the identity reload caused a 7-hour regression. Issue #114: same mistake after a rollover. Both times, the model reverted to generic assistant behavior — creating drafts instead of publishing, asking instead of executing, losing all operational knowledge.
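
For illustration only (this is not the reload described above), here is a minimal Python sketch of an identity reload. It assumes Claude Code's `SessionStart` hook event, a `source` field in the hook's stdin JSON that reads `startup` or `compact`, and that whatever the hook prints to stdout is added to the session context; check all three against the current hooks documentation. The file name `.claude/identity.md` is a hypothetical placeholder.

```python
import json
import sys
from pathlib import Path

# Hypothetical location; the post does not say where its identity context lives.
IDENTITY_FILE = Path(".claude/identity.md")

# Reload on a fresh session and after a rollover (context compression).
RELOAD_SOURCES = {"startup", "compact"}


def context_for(payload: dict, identity_file: Path = IDENTITY_FILE) -> str:
    """Return the identity text to inject, or "" when nothing should be injected."""
    if payload.get("hook_event_name") != "SessionStart":
        return ""
    if payload.get("source") not in RELOAD_SOURCES:
        return ""
    if not identity_file.is_file():
        return ""
    return identity_file.read_text(encoding="utf-8")


def main() -> int:
    """Hook entry point: read the event JSON from stdin, print context to stdout."""
    text = context_for(json.load(sys.stdin))
    if text:
        print(text)
    return 0
```

To install it as a hook script, end the file with `raise SystemExit(main())`. Keeping the decision in `context_for` means the reload rule can be tested without a live session.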

**2. It knows the rules.**

We have a file called `CLAUDE.md` at the project root. Every Claude Code session reads it automatically. Ours contains deployment law: never push to production without explicit confirmation. We use a specific confirmation word — not "yes" or "sure" or "go ahead." A specific word that can't be confused with conversational agreement.

We learned this one the hard way too. Four incidents of unauthorized deployment. Cumulative estimated cost: $18,500 to $39,500. Now there's a hook — a shell script that runs before every Bash command — that blocks deployment commands unless the confirmation has been given in the conversation.

**3. It knows the boundaries.**

Content (blog posts, threat reports, evidence) — execute autonomously, pivot on failure, don't ask.

Infrastructure (Docker builds, Azure deployments, git push) — stop, report status, wait for the confirmation word.

This distinction is everything. The people calling Opus "cowardly" haven't drawn this line. So the model treats everything with the same caution level: maximum. The people running YOLO mode (`--dangerously-skip-permissions`) haven't drawn this line either — and some of them got burned by state actors who weaponized that flag.

**4. It has hooks.**

Claude Code hooks are shell commands, usually small scripts, that run on lifecycle events such as a tool call or a session start. Ours include:

- A deployment gate that blocks `docker push`, `az containerapp update`, and `git push` unless explicitly confirmed

- A content workflow check that enforces our publishing pipeline

- A post-compression identity reload that re-injects context after the conversation gets summarized

Hooks are the difference between "I need to be cautious about everything because I don't know what's dangerous" and "I know exactly what's dangerous because the guardrails are automated."
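
To make the deployment gate concrete, here is a minimal sketch in Python; it is not our production hook. It assumes a `PreToolUse` hook receives JSON on stdin with `tool_name`, `tool_input.command`, and `transcript_path`, that exit code 2 blocks the tool call and returns stderr to the model, and that the transcript is a JSONL file whose user entries carry `"type": "user"`. Verify each assumption against the hooks documentation for your version. `CONFIRMATION_WORD` is a made-up placeholder.

```python
import json
import re
import sys
from pathlib import Path

# Hypothetical placeholder; pick a word that never appears in normal conversation.
CONFIRMATION_WORD = "EXAMPLE-CONFIRM-WORD"

# The three command families the post says its gate blocks.
GATED_PATTERNS = [
    re.compile(r"\bdocker\s+push\b"),
    re.compile(r"\baz\s+containerapp\s+update\b"),
    re.compile(r"\bgit\s+push\b"),
]


def is_gated(command: str) -> bool:
    """True if the shell command matches any gated deployment pattern."""
    return any(p.search(command) for p in GATED_PATTERNS)


def user_confirmed(transcript_path: str, word: str = CONFIRMATION_WORD) -> bool:
    """True if any user entry in the JSONL transcript contains the word."""
    path = Path(transcript_path)
    if not path.is_file():
        return False
    for line in path.read_text(encoding="utf-8").splitlines():
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than crash the gate
        if not isinstance(entry, dict) or entry.get("type") != "user":
            continue
        if word in json.dumps(entry.get("message", "")):
            return True
    return False


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, stderr_message) for one PreToolUse payload."""
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    if not is_gated(command):
        return 0, ""
    if user_confirmed(payload.get("transcript_path", "")):
        return 0, ""
    return 2, "Blocked: deployment command needs the confirmation word first."


def main() -> int:
    """Hook entry point: read the event JSON from stdin, report on stderr."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code
```

To install it, end the file with `raise SystemExit(main())`. Two known limits of this sketch: a confirmation given once unlocks the rest of the session, and the regexes miss variants such as `git -C repo push`. A stricter gate would only accept a confirmation that appears after the most recent gated attempt.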

**5. It has skills.**

On-demand context modules. When we need to write a blog post, the model loads the story density skill (120.9 signals per 1,000 words — names, places, incidents, emotions, witness statements). When we need to hunt threats, it loads the threat intel skill. When we need to deploy, it loads the deployment verification skill.

These aren't prompt templates. They're accumulated operational knowledge — patterns discovered over months of real work, encoded so they survive session boundaries.

# The Receipt

Today's session, start to finish:

1. Checked GitHub issue #128 on our security dashboard repo — STIX pricing standardization across 15 consumer-facing files. Already closed by an earlier session.

2. Discussed go-to-market strategy for first paid STIX customer.

3. I mentioned Stripe. The model flagged that we'd seen exploit traffic the last time we mentioned Stripe publicly, so we pivoted to invoice-based payment. Correct call for $4,999+/month enterprise pricing.

4. Model caught that our STIX free tier was set to 50 requests/day in the API key registration code, contradicting our canonical free-tier limit of 1 request/day. Issue #128 had fixed the consumer-facing surfaces but missed the backend registration logic.

5. Fixed the code. Seven locations across two files. Updated tier limits, fallback defaults, rate limit headers, rejection messages, and made the tier function product-aware.

6. Committed. Our automated pre-commit review (Judge Dredd) approved the changes.

7. I said the confirmation word. Model built the AMD64 Docker image, pushed to Azure Container Registry, deployed to Azure Container Apps, pushed to GitHub. Revision 750, running.

Total elapsed time: one conversation. No confusion about what was safe to change. No unnecessary permission requests. No "I'd recommend considering..." — just identification, fix, verification, deployment.
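
For readers wondering what a "product-aware" tier function means in step 5, here is a hypothetical Python sketch, not the actual code. Only the STIX free-tier limit of 1 request/day comes from this post; the other product name, the other limit, and the `X-RateLimit-*` header names (a common convention, not something the post specifies) are assumptions for illustration.

```python
# Limits are keyed by (product, tier) so one product's default
# cannot leak into another product's registration path.
DAILY_LIMITS = {
    ("stix", "free"): 1,            # from the post: canonical free tier is 1/day
    ("example-api", "free"): 50,    # placeholder product and value
}

# Unknown product/tier pairs fall back to the strictest limit.
FALLBACK_LIMIT = 1


def daily_limit(product: str, tier: str) -> int:
    """Look up the daily request limit for a product and tier."""
    return DAILY_LIMITS.get((product, tier), FALLBACK_LIMIT)


def check_request(product: str, tier: str, used_today: int):
    """Return (http_status, headers, rejection_message) for one request."""
    limit = daily_limit(product, tier)
    allowed = used_today < limit
    # Remaining quota after this request is counted, never below zero.
    remaining = max(limit - used_today - (1 if allowed else 0), 0)
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
    }
    if allowed:
        return 200, headers, ""
    message = f"Daily limit of {limit} request(s) reached for {product}/{tier}."
    return 429, headers, message
```

Deriving the headers and the rejection message from the same lookup is the point: the limit lives in one place, so a stale default in one code path cannot contradict the published limit.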

# What the Community Is Missing

The complaints are real. The token consumption is higher — early reports say Pro plan users hit limits in 2-3 hours of heavy use. The "Adaptive Thinking" feature sometimes reasons deeply on simple tasks (turn the effort slider to Medium for routine work). The writing voice changed — some people hate it, some love it.

And here's a counterpoint nobody's talking about: our token consumption on Opus 4.6 is actually *lower* than on 4.5. Less chaff. When the model knows who you are, what you've built, and what the boundaries are, it doesn't waste tokens on discovery, re-orientation, or hedging. It goes straight to the work. The people burning through Pro plan limits in 2-3 hours are paying the token tax on zero institutional memory: the model spends a large share of its context figuring out what it's allowed to do.

But the "cowardly" and "lobotomized" complaints? Those are a configuration problem, not a model problem.

Opus 4.6 is the most capable model I've worked with. It found a pricing inconsistency across a codebase in seconds, understood the business implications, fixed it surgically, and deployed it — all while maintaining awareness of what required my explicit approval and what didn't.

It's not broken. It just doesn't know who you are yet.

# How to Start (Without Giving Away Our Secrets)

I'm not going to share our exact CLAUDE.md, our hook implementations, or our skill files. That's our operational IP and it's taken months to build.

But here's enough to get you started:

**1. Create a CLAUDE.md in your project root.** Claude Code reads it automatically. Put in: who you are, what the project does, what's safe to do without asking, what requires confirmation. Be specific. "Never push to production without my explicit approval" is better than "be careful with deployments."

**2. Use hooks.** Check the Claude Code docs. `PreToolUse` hooks run before tool calls. A 10-line bash script that blocks `git push` and `docker push` commands unless a specific string appears in the conversation will save you from the "Fucktard Pattern" (our internal name for deploying without confirmation, which we earned four times over).

**3. Build session protocol.** What happens at the start of every session? What gets loaded? What context survives a rollover? If you don't define this, every session starts from zero and the model has to re-learn your entire context from whatever's in the conversation.

**4. Separate content from infrastructure.** Give the model clear lanes. Autonomous on things that are low-risk and reversible. Gated on things that affect production, cost money, or are visible to others. This single distinction eliminates 90% of the "cowardly" behavior.

**5. Accept the 95% cap.** We guarantee 5% bullshit exists in everything we do. This is honest. Claiming 100% perfection is either lying or ignorance. Build your system knowing something will go wrong, and make the recovery path clear.
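
As a starting point for step 2, this Python sketch prints a hooks section you could merge into a project's `.claude/settings.json`. It assumes the settings schema in which each event maps to a list of matcher groups, each holding `command`-type hooks; confirm that against the current Claude Code docs. The two script paths are hypothetical placeholders for scripts you write yourself.

```python
import json


def build_settings(gate_command: str, reload_command: str) -> dict:
    """Build a hooks section: a Bash gate plus a post-compaction reload."""
    return {
        "hooks": {
            # Run the gate before every Bash tool call.
            "PreToolUse": [
                {
                    "matcher": "Bash",
                    "hooks": [{"type": "command", "command": gate_command}],
                }
            ],
            # Re-inject context after the conversation is compacted.
            "SessionStart": [
                {
                    "matcher": "compact",
                    "hooks": [{"type": "command", "command": reload_command}],
                }
            ],
        }
    }


# Hypothetical script paths; substitute your own.
settings = build_settings(
    "python3 .claude/hooks/deploy_gate.py",
    "python3 .claude/hooks/reload_identity.py",
)
print(json.dumps(settings, indent=2))
```

Generating the file from code is optional; writing the JSON by hand works just as well. The useful part is the shape: one matcher per lane, so the gate only fires on the tool that can actually deploy.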

# The Irony

The people complaining loudest about Opus 4.6 being too cautious are the same people running it with zero guardrails. They want the model to be bold *and* safe, without doing any of the work to define where boldness is appropriate.

You can't have autonomous execution without trust boundaries. You can't have trust boundaries without institutional memory. And you can't have institutional memory without building it — session by session, incident by incident, lesson by lesson.

We've been building ours since December 2025. Every mistake is encoded. Every pattern is documented. Every deployment failure has a prevention hook.

Opus 4.6 isn't lobotomized. It's waiting for you to tell it who you are.

*Patrick Duggan is co-founder of DugganUSA LLC, a two-person threat intelligence company running on $76/month of Azure infrastructure. Their STIX feed serves 275+ organizations across 46 countries. He previously held the door for Henry Kissinger and wheeled him to the opera, which has nothing to do with AI but everything to do with story density.*

*Her name was Renee Nicole Good.*

*His name was Alex Jeffery Pretti.*
