We Read Our Own AI Report Card Out Loud. Then We Ran the Same Test on Cribl.
- Patrick Duggan
Microsoft started handing out report cards and most people have not noticed yet.
On February 11, 2026, Bing Webmaster Tools shipped a new section called AI Performance, in public preview. For the first time it shows publishers how often their content gets cited inside generative answers: Microsoft Copilot, the AI summaries that now sit at the top of Bing, and a handful of partner AI experiences. It surfaces the exact pages that get referenced, and it introduced a strange new unit of measurement called a grounding query: the reformulated question Copilot writes to itself, behind the scenes, when it decides it needs to go read the web before answering you. On June 16 Microsoft expanded the section with intent labels, topic clusters, a Citation Share metric, and period-over-period comparison.
Translation: there is now an official, Microsoft-run scoreboard for whether the machines that are quietly replacing search can see you at all. So we did the uncomfortable thing. We pulled ours and read it out loud.
Our Report Card, Unedited
Here is aipmsec.com, our own AI-presence product, the brand whose entire job is helping companies show up correctly in AI. Its Bing AI Performance export covers April 14 through June 28. Total Copilot citations in that window: eight. Seven of them landed on a single day, April 21. One more on May 13. Every other day is a zero, and the last six and a half weeks, May 14 to June 28, are an unbroken line of zeros.
Here is www.dugganusa.com, our main blog, 1,600-plus posts deep. Eleven citations total. One on April 13, one on April 29, a burst of seven on May 2, two on May 6. Then nothing. Seven and a half weeks of zero, May 7 through June 28.
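For anyone who wants to check our arithmetic, here is a minimal sketch that recomputes the totals and the trailing run of zeros from the daily counts quoted above. The dictionary layout and the `summarize` helper are our own illustration, not the shape of the Bing export file.

```python
from datetime import date

# Daily citation counts as quoted in the post; every unlisted day is zero.
AIPMSEC = {date(2026, 4, 21): 7, date(2026, 5, 13): 1}
DUGGANUSA = {
    date(2026, 4, 13): 1,
    date(2026, 4, 29): 1,
    date(2026, 5, 2): 7,
    date(2026, 5, 6): 2,
}
EXPORT_END = date(2026, 6, 28)  # last day covered by the export


def summarize(citations, end):
    """Return (total citations, zero days after the last cited day through end)."""
    total = sum(citations.values())
    last_cited = max(citations)
    # Counts every day from the day after the last citation through `end`.
    zero_days = (end - last_cited).days
    return total, zero_days


for name, data in (("aipmsec.com", AIPMSEC), ("dugganusa.com", DUGGANUSA)):
    total, zero_days = summarize(data, EXPORT_END)
    print(f"{name}: {total} citations, {zero_days} zero days ({zero_days / 7:.1f} weeks)")
```

Run against the numbers above, this gives 8 citations and 46 zero days for aipmsec.com, and 11 citations and 53 zero days for dugganusa.com.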
That is not a typo and it is not a humble-brag with a twist ending. We sell AI-presence auditing, and Microsoft's own instrument says the large language layer barely cites us, and lately not at all. O'Toole's Axiom — Murphy was an optimist — applies to your own dashboards first.
So why publish it? Because the number that actually predicts whether a company will fix its AI visibility is not its citation count. It is whether the company is willing to look at its citation count. We look at ours. We are showing you ours. That is the entire thesis of the product in one screenshot: you cannot fix what you refuse to measure, and almost nobody is measuring this yet.
Then We Pointed The Instrument At Someone Bigger
Cribl is a very good company. It closed a 319-million-dollar Series E at a 3.5-billion-dollar valuation in August 2024, it became one of the fastest infrastructure companies ever to cross 100 million in annual recurring revenue, and it is trusted by something like half the Fortune 100. None of that is in dispute and we are not here to dispute it.
What caught our attention is the pitch. Cribl's whole 2026 narrative is about data accuracy for the AI era. Their framework states it as an equation — data integrity equals accuracy plus consistency plus context. Their messaging promises to transform raw telemetry into the AI-ready foundation your teams and your AI agents need to succeed. Their newer products lean on context-aware analysis that goes beyond simple pattern matching to ensure, in their words, precise, accurate identification. The thesis they are selling to every customer is: your data has to be accurate and richly contextual, or the AI cannot be trusted with it.
It is a good thesis. It happens to be exactly the question our AIPMSEC auditor was built to answer — just turned around and aimed back at the company selling it. So on June 30 we ran the audit. Same five-model council, same scoring, same day we scored ourselves. The instrument asks GPT-4o, Claude, Gemini, Mistral, and DeepSeek what they actually know about a company, checks their answers against the verifiable facts, and then separately measures how machine-legible the company's own website is — because that legibility is how the models ground themselves in the first place.
The Receipts, Side By Side
Same auditor, same five models, same morning. Every score is capped at 95, because we guarantee five percent of any confident-sounding number is nonsense, including ours.
| Measure | dugganusa.com | aipmsec.com | cribl.io |
| --- | --- | --- | --- |
| Overall AI-perception | 34 | 33 | 40 |
| AI awareness (do models know you) | 35 | 35 | 51 |
| AIPM-NPS (would models recommend you) | -33 | -33 | +33 |
| AI accuracy (do models get your facts right) | 30 | 30 | 19 |
| Site machine-readability | 87 | 86 | 63 |
| Schema.org structured data | 85 | 85 | 5 |
| Combined score | 55 | 54 | 49 |
Read the top of that table honestly, because we promised honesty. Cribl beats us where size and time buy you presence. The models are more aware of Cribl than of us — awareness 51 against our 35 — and when asked whether they would recommend Cribl for observability work, the council comes back net positive, a plus 33, while the same council scores us at minus 33. A minus 33 means the models either do not know us well enough to vouch for us or actively steer people elsewhere. We are a startup founded in late 2025. That is what the cold start looks like, and pretending otherwise would make the rest of this post worthless.
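If the plus 33 and minus 33 look suspiciously symmetrical, this is roughly why. The sketch below assumes AIPM-NPS follows the standard Net Promoter arithmetic (share of promoters minus share of detractors, on a 0 to 10 scale). That assumption is ours for illustration, and the ratings are hypothetical, not the council's actual answers.

```python
def net_promoter_score(ratings):
    """Standard NPS: percent promoters (9-10) minus percent detractors (0-6)."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return round(100 * (promoters - detractors) / len(ratings))


# Hypothetical ratings from three responding models.
one_promoter = [9, 7, 8]   # one promoter, two passives
one_detractor = [6, 7, 8]  # one detractor, two passives

print(net_promoter_score(one_promoter))   # 33
print(net_promoter_score(one_detractor))  # -33
```

With only a few respondents, a single model moving from lukewarm to enthusiastic swings the score by 33 points, which is worth remembering before reading too much into either number.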
Now read the bottom of the table, because that is where the conceit lives.
On AI accuracy — the precise thing Cribl sells — the three models that answered scored Cribl at 19 out of 95, and two of the five council members, Gemini and Mistral, returned nothing usable about Cribl at all on that run. We scored 30. The tiny, barely-cited startup is represented more accurately inside the models than the 3.5-billion-dollar accuracy evangelist. Only one model, Claude, recovered Cribl's basic facts cleanly — the 2018 founding, the three founders, San Francisco, Cribl Stream — and it only got there because it stopped reasoning from memory and ran a live web search mid-answer.
And then the line that ties the whole thing together. Structured data. Schema.org markup is the single most direct, machine-verifiable way for a website to tell a model this is who we are, this is what we make, in a format built specifically for machines to ground on. It is the literal accuracy-and-context layer, shipped as code on your own pages. Our score is 85. Cribl's is 5.
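For readers who have never seen this layer, here is a minimal sketch of what it looks like: a Schema.org Organization description emitted as JSON-LD. Every value is a placeholder (Example Corp and example.com are made up), and the `organization_jsonld` helper is hypothetical. The property names themselves (`name`, `url`, `foundingDate`, `description`, `sameAs`) are standard Schema.org vocabulary.

```python
import json


def organization_jsonld(name, url, founding_date, description, same_as):
    """Build a Schema.org Organization description as a JSON-LD dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "foundingDate": founding_date,
        "description": description,
        "sameAs": list(same_as),  # official profiles that corroborate identity
    }


# Placeholder values for illustration only.
markup = organization_jsonld(
    name="Example Corp",
    url="https://www.example.com",
    founding_date="2018",
    description="Example Corp builds example products.",
    same_as=["https://www.linkedin.com/company/example"],
)

# JSON-LD is normally embedded in the page head inside a script tag.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(markup, indent=2)
    + "\n</script>"
)
print(snippet)
```

That is the whole trick: a few dozen lines of machine-readable facts on your own pages, stating who you are in a format a crawler does not have to guess at.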
A company whose pitch to customers is that raw telemetry lacks business meaning and must be enriched with metadata before a machine can trust it — service owner, criticality, region, business unit — publishes a corporate website with almost no business metadata for a machine to read. They left the AI crawlers fully welcome at the door, robots.txt wide open to GPTBot and ClaudeBot and PerplexityBot, an llms.txt file in place. And then gave those invited machines five out of ninety-five worth of structured ground truth to stand on. That is the conceit, measured: enrich your data so the AI can trust it, said the site the AI cannot accurately read.
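Checking the front door yourself takes a few lines. This is a sketch using Python's standard-library robots.txt parser against two inline example files. The file contents are invented for illustration, not copied from any real site, and `crawler_access` is our own helper name.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]

# Invented examples: one file that welcomes everyone, one that blocks GPTBot.
OPEN_ROBOTS = """\
User-agent: *
Allow: /
"""

BLOCKING_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def crawler_access(robots_txt, url, agents):
    """Map each crawler name to whether robots.txt lets it fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


print(crawler_access(OPEN_ROBOTS, "https://example.com/", AI_CRAWLERS))
print(crawler_access(BLOCKING_ROBOTS, "https://example.com/", AI_CRAWLERS))
```

An open door is necessary and not sufficient. The check above only says whether a crawler may come in, not whether it finds anything structured to read once it does.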
What Both Halves Are Actually Telling Us
The Bing report and the AIPMSEC audit are two instruments pointed at the same disease, and they agree.
Bing says: even the company that sells AI presence is getting almost no Copilot citations, and so, we would wager, are you. This is not a DugganUSA problem. The generative layer is citing a vanishingly small slice of the web, and the days of zeros in our export are the days of zeros in nearly everyone's export. AI visibility in mid-2026 is a near-empty field, which means it is the most winnable field on the internet right now.
AIPMSEC says: the way you win it is not by buying awareness, it is by becoming legible to machines. Cribl bought awareness — they are bigger, older, better-funded, and the models know their name. We have not bought any. But on the two things that determine whether a machine can describe you correctly rather than just recognize your logo — structured data and factual accuracy — the scrappy startup that built the auditor outscores the giant that preaches the gospel. Combined score, the one that weights machine-legibility into perception, lands us at 55 and Cribl at 49. We will take that trade every day, because awareness compounds slowly and machine-legibility is a weekend of work.
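The post gives the combined scores but not the weighting behind them. As a back-of-the-envelope check, one blend that happens to reproduce all three published values is 60 percent overall AI-perception and 40 percent site machine-readability. Treat that weight as a guess fitted to the table, not as the auditor's documented formula.

```python
# Rows copied from the table: overall AI-perception, site machine-readability,
# and the published combined score.
ROWS = {
    "dugganusa.com": (34, 87, 55),
    "aipmsec.com": (33, 86, 54),
    "cribl.io": (40, 63, 49),
}

PERCEPTION_WEIGHT = 0.6  # assumption: fitted to the table, not a documented value


def combined_score(perception, readability, weight=PERCEPTION_WEIGHT):
    """Weighted blend of perception and machine-readability, rounded to an integer."""
    return round(weight * perception + (1 - weight) * readability)


for domain, (perception, readability, published) in ROWS.items():
    estimate = combined_score(perception, readability)
    print(f"{domain}: estimated {estimate}, published {published}")
```

Under that assumed blend, the six-point gap in our favor comes entirely from the readability term: Cribl leads on perception, 40 to 34, and trails on machine-readability, 63 to 87.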
We are not going to tell you we won. We got eight citations and a minus 33. We are going to tell you something more useful: Microsoft just started grading this for free, the whole class is failing, and the answer key is structured data and honest measurement. We published our own failing marks to prove we read the test. Cribl, with a hundred times our resources and a thesis that is literally about this, scored a 5 on the one section that is pure homework.
Open your own Bing Webmaster Tools. Find the AI Performance tab. Read your citation count out loud. If it is a column of zeros, you are not behind — you are early, and so is almost everyone. Then go fix the layer the machines actually read. That is the whole game now, and the scoreboard finally exists.
Methodology, for the skeptics: Bing AI Performance figures are the official Citations and Cited Pages exports for each verified property, April through June 28, 2026. AIPMSEC scores come from our five-model council audit run June 30, 2026, all three domains on the same configuration; two council models returned no usable answer for Cribl on that run, which deflates Cribl's accuracy and awareness averages and is disclosed here rather than hidden. All scores are capped at 95 on principle. We will happily re-run any of it on camera.
