
VulnHalla: 7 CVEs in 2 Days for $80

  • Writer: Patrick Duggan
  • Dec 21, 2025

---
title: "VulnHalla: 7 CVEs in 2 Days for $80 - CyberArk's LLM + CodeQL Monster"
slug: vulnhalla-cyberark-codeql-llm-vuln-hunting
date: 2025-12-21
author: Patrick Duggan
tags: [vulnhalla, cyberark, codeql, llm, vulnerability-hunting, cve, open-source]
category: Threat Intelligence
featured: false
---


The Problem With Static Analysis


CodeQL is powerful. It finds bugs. It also finds thousands of "bugs" that aren't actually exploitable. Security teams drown in false positives. Real vulnerabilities hide in the noise.


CyberArk Labs just open-sourced a solution: VulnHalla.




What VulnHalla Does


VulnHalla layers LLM reasoning on top of CodeQL results. It solves two fundamental problems:


The WHERE Problem: LLMs don't know which code sections deserve attention in million-line codebases. CodeQL tells them where to look.


The WHAT Problem: Even when examining the right location, models need to know what bug class to seek. CodeQL provides the category; LLMs assess exploitability.


The magic is in the combination. CodeQL generates alerts. VulnHalla retrieves context in ~3 seconds per finding. Then it asks "guided questions" that simulate how experienced researchers think:



• Where are variables declared and what sizes do they have?

• Do those sizes change between declaration and use?

• What constraints exist on source/destination buffers?

• Can an attacker control the inputs?


This isn't pattern matching. This is reasoning about exploitability.
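One way to picture the guided-question step is as a prompt template wrapped around each alert. The sketch below is a minimal illustration, not VulnHalla's actual code: the `Alert` fields, `build_prompt`, and the prompt wording are all assumptions. Only the four questions come from the list above.

```python
from dataclasses import dataclass


# Hypothetical alert shape; real CodeQL output carries more fields.
@dataclass
class Alert:
    rule: str      # bug class reported by the static analyzer
    file: str
    line: int
    snippet: str   # code context retrieved for the finding


# The four guided questions, asked in order.
GUIDED_QUESTIONS = [
    "Where are variables declared and what sizes do they have?",
    "Do those sizes change between declaration and use?",
    "What constraints exist on source/destination buffers?",
    "Can an attacker control the inputs?",
]


def build_prompt(alert: Alert) -> str:
    """Assemble one triage prompt: the alert, its code, then the questions."""
    questions = "\n".join(f"{i}. {q}" for i, q in enumerate(GUIDED_QUESTIONS, 1))
    return (
        f"Static analysis flagged a possible '{alert.rule}' issue "
        f"at {alert.file}:{alert.line}.\n\n"
        f"Code:\n{alert.snippet}\n\n"
        "Answer each question, then state whether the issue is exploitable:\n"
        f"{questions}\n"
    )


if __name__ == "__main__":
    demo = Alert(
        rule="Copy function using source size",
        file="src/parse.c",  # made-up path for the demo
        line=42,
        snippet="char dst[16];\nmemcpy(dst, src, strlen(src));",
    )
    print(build_prompt(demo))
```

CodeQL supplies the WHERE (file and line) and the WHAT (the rule name); the questions push the model to work through sizes, constraints, and attacker control before it commits to a verdict.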




The Results: 7 CVEs in 2 Days


| CVE | Project | Impact |
|-----|---------|--------|
| CVE-2025-38676 | Linux Kernel | Memory corruption |
| CVE-2025-0518 | FFmpeg | Buffer overflow |
| CVE-2025-27151 | Redis | Memory safety |
| CVE-2025-8854 | Bullet3 | Physics engine vuln |
| CVE-2025-9136 | RetroArch | Emulator exploit |
| CVE-2025-9809 | Libretro | Core library issue |
| CVE-2025-9810 | Linenoise | Input handling |


Total cost: Under $80 in LLM API calls.


Time invested: Two days.


Projects scanned: 100 large C repositories.




The False Positive Massacre


Raw CodeQL output is noisy. VulnHalla filters it:


| Bug Class | False Positive Reduction |
|-----------|-------------------------|
| Operator precedence logic errors | 96% |
| Pointer offset before check | 91.96% |
| Copy function using source size | 91.53% |
| Missing null checks | 89.26% |


Human verification on the remaining findings: 62% were genuine issues.


Compare that to typical static analysis tools reporting 80%+ false positives. VulnHalla roughly inverts the ratio: most of what survives filtering is real.
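To make the reduction rates concrete, here is a small worked calculation. The percentages are the ones from the table above; the starting count of 1,000 false positives per bug class is a made-up number for illustration, not a figure from the research.

```python
# Reduction rates from the table above, in percent.
FP_REDUCTION = {
    "Operator precedence logic errors": 96.0,
    "Pointer offset before check": 91.96,
    "Copy function using source size": 91.53,
    "Missing null checks": 89.26,
}


def remaining_false_positives(false_positives: int, reduction_pct: float) -> int:
    """False positives left after filtering, rounded to the nearest alert."""
    return round(false_positives * (1 - reduction_pct / 100))


if __name__ == "__main__":
    hypothetical_fp = 1000  # assumed starting count, not from the research
    for bug_class, pct in FP_REDUCTION.items():
        left = remaining_false_positives(hypothetical_fp, pct)
        print(f"{bug_class}: {hypothetical_fp} -> {left}")
```

Under that assumption, 1,000 operator-precedence false positives shrink to 40, and even the weakest row (missing null checks) drops to 107. That is the difference between a backlog nobody reads and a list a human can review.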




Why This Matters


For Security Teams


You can now run CodeQL on your codebase and actually triage the results. The LLM layer separates "technically matches pattern" from "actually exploitable."


For Bug Bounty Hunters


Under $80 and two days got CyberArk seven CVEs, including one in the Linux kernel. The ROI is absurd.


For Threat Intel


These CVEs are now indexed. If we see exploitation attempts targeting FFmpeg CVE-2025-0518 or Redis CVE-2025-27151, we know the lineage. VulnHalla-discovered bugs may well show up in the wild.




The Technical Innovation


The CSV pre-indexing approach is clever:


1. Extract all functions, structs, and globals into CSV files before scanning

2. Run CodeQL queries to generate security alerts

3. Retrieve relevant code context in ~3 seconds per finding

4. No repository clones needed

5. No compiler access required


This eliminates multi-minute CodeQL query latencies. You can process hundreds of findings without waiting for database rebuilds.
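The pre-indexing idea described above can be sketched in a few lines: parse the CSV once into a lookup table, then answer each context request with a dictionary access and a line slice. The column names (`name`, `file`, `start_line`, `end_line`), the sample rows, and both helper functions are assumptions for illustration; VulnHalla's real CSV layout may differ.

```python
import csv
import io

# Assumed CSV layout for illustration; the real column names may differ.
SAMPLE_INDEX = """name,file,start_line,end_line
parse_header,src/parse.c,10,14
copy_name,src/util.c,3,6
"""

# Stand-in for source files on disk, keyed by path.
SAMPLE_SOURCES = {
    "src/util.c": "\n".join([
        "#include <string.h>",
        "",
        "void copy_name(char *dst, const char *src) {",
        "    char tmp[16];",
        "    strcpy(tmp, src);",
        "}",
    ]),
}


def load_index(csv_text: str) -> dict[str, dict[str, str]]:
    """Build a name -> row lookup once, so each later query is a dict access."""
    return {row["name"]: row for row in csv.DictReader(io.StringIO(csv_text))}


def function_source(index, sources, name):
    """Return the indexed line range of a function, or None if unavailable."""
    row = index.get(name)
    if row is None or row["file"] not in sources:
        return None
    lines = sources[row["file"]].splitlines()
    start, end = int(row["start_line"]), int(row["end_line"])
    return "\n".join(lines[start - 1:end])  # CSV line numbers are 1-based


if __name__ == "__main__":
    index = load_index(SAMPLE_INDEX)
    print(function_source(index, SAMPLE_SOURCES, "copy_name"))
```

The design point is that the expensive extraction happens once, up front. Every later lookup is a cheap read against flat files, so no CodeQL query sits in the loop for each finding.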




The Pattern Recognition


This is the same architecture we use for threat intelligence:


VulnHalla: CodeQL (structured detection) → LLM (reasoning about exploitability) → CVEs


DugganUSA: ThreatFox/VirusTotal (structured detection) → LLM correlation (reasoning about attribution) → IOCs


Layer reasoning on top of structured data. Let the detection engine find candidates. Let the LLM assess significance.
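Stripped of domain detail, the pattern is a two-stage filter: a detector proposes candidates and an assessor scores them. The sketch below is a generic illustration under stated assumptions. `triage`, the 0.5 threshold, and `stub_assess` are hypothetical names and values, and the keyword check merely stands in for an LLM call so the example runs offline.

```python
from typing import Callable, Iterable, TypeVar

C = TypeVar("C")  # candidate type: a CodeQL alert, an IOC, and so on


def triage(
    candidates: Iterable[C],
    assess: Callable[[C], float],
    threshold: float = 0.5,
) -> list[tuple[C, float]]:
    """Score each detector candidate; keep those at or above the threshold."""
    scored = [(c, assess(c)) for c in candidates]
    kept = [(c, s) for c, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Placeholder for the reasoning step. A real system would call an LLM here;
# this keyword check exists only so the sketch runs without network access.
def stub_assess(alert: str) -> float:
    return 0.9 if "attacker-controlled" in alert else 0.1


if __name__ == "__main__":
    alerts = [
        "memcpy with attacker-controlled length",
        "memcpy with constant length",
    ]
    for alert, score in triage(alerts, stub_assess):
        print(f"{score:.1f}  {alert}")
```

Swap the candidate type and the assessor and the same skeleton covers either pipeline.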


CyberArk applied it to vulnerabilities. We applied it to threat actors. Same pattern, different domain.




Get VulnHalla


Repository: Check CyberArk Labs GitHub (link in original research)


Original Research: https://www.cyberark.com/resources/threat-research-blog/vulnhalla-picking-the-true-vulnerabilities-from-the-codeql-haystack


Cost to run: under $80 in LLM API calls for meaningful results across 100 repos




The CVEs We're Tracking


These seven CVEs are now in our index:



CVE-2025-38676 - Linux Kernel
CVE-2025-0518  - FFmpeg
CVE-2025-27151 - Redis
CVE-2025-8854  - Bullet3
CVE-2025-9136  - RetroArch
CVE-2025-9809  - Libretro
CVE-2025-9810  - Linenoise


If exploitation attempts surface, we'll correlate. The VulnHalla research gives us the vulnerability details. Our infrastructure gives us the exploitation telemetry.
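Correlation at its simplest is a matter of spotting tracked CVE IDs in incoming telemetry. The sketch below is a rough illustration, not our production pipeline: `tracked_hits` and the sample log lines are invented for the example, while the seven CVE IDs are the ones listed above.

```python
import re

# The seven CVE IDs listed above, mapped to their projects.
TRACKED_CVES = {
    "CVE-2025-38676": "Linux Kernel",
    "CVE-2025-0518": "FFmpeg",
    "CVE-2025-27151": "Redis",
    "CVE-2025-8854": "Bullet3",
    "CVE-2025-9136": "RetroArch",
    "CVE-2025-9809": "Libretro",
    "CVE-2025-9810": "Linenoise",
}

# CVE IDs are "CVE-", a four-digit year, then four or more digits.
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def tracked_hits(log_lines):
    """Yield (cve_id, project, line) for each tracked CVE mentioned in a line."""
    for line in log_lines:
        for match in CVE_PATTERN.findall(line):
            cve_id = match.upper()
            if cve_id in TRACKED_CVES:
                yield cve_id, TRACKED_CVES[cve_id], line


if __name__ == "__main__":
    # Invented log lines, for demonstration only.
    sample = [
        "scanner probing cve-2025-27151 on port 6379",
        "unrelated advisory CVE-2024-0001",
    ]
    for cve_id, project, line in tracked_hits(sample):
        print(f"{cve_id} ({project}): {line}")
```

Real exploitation traffic rarely names its CVE, so string matching like this only catches scanner banners, advisories, and tagged IOCs. Behavioral signatures would have to come from the vulnerability details themselves.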




Credit Where Due


CyberArk Labs shipped this as open source. They published the methodology. They shared the CVEs.


This is how security research should work: find bugs, report them, publish the technique, let others build on it.


We're signal boosting because good research deserves attention.




*DugganUSA indexes threats. CyberArk finds the vulnerabilities that become threats. The loop closes.*





• VulnHalla Research: https://www.cyberark.com/resources/threat-research-blog/vulnhalla-picking-the-true-vulnerabilities-from-the-codeql-haystack

• Our Threat Intel: https://analytics.dugganusa.com/api/v1/stix-feed

• CVE Index: Now tracking VulnHalla discoveries



Get Free IOCs

Subscribe to our threat intelligence feeds for free, machine-readable IOCs:

AlienVault OTX: https://otx.alienvault.com/user/pduggusa

STIX 2.1 Feed: https://analytics.dugganusa.com/api/v1/stix-feed

