# 50,136 Documents We Preserved That DOJ Delisted
- Patrick Duggan
- Feb 19

On January 30, 2026, the Department of Justice released the Epstein files. We indexed everything that day.
Last night, we ran a full reconciliation — scraping every page of every dataset on the DOJ's live website and comparing it against our index.
The results:
| | Count |
|--|-------|
| DOJ currently lists | 5,559 |
| Our index has | 55,689 |
| **We have, they delisted** | **50,136** |
| They have, we don't | 6 |
| Both still list | 5,553 |
The DOJ released 55,689 unique documents. They currently list 5,559. We preserved the other 50,136.
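
The five figures in the table are tied together by simple set arithmetic, so they are easy to sanity-check. A minimal sketch in Python, using only the counts reported above (the variable names are ours, chosen for illustration):

```python
# Counts copied from the summary table above.
doj_listed = 5_559   # IDs currently on the DOJ's live pages
our_index = 55_689   # IDs in our index
both = 5_553         # IDs present in both

# In our index, no longer listed by the DOJ.
we_have_they_delisted = our_index - both
# Listed by the DOJ, absent from our index.
they_have_we_dont = doj_listed - both
# Share of our index that is no longer listed.
delisted_share = we_have_they_delisted / our_index

print(we_have_they_delisted)    # 50136
print(they_have_we_dont)        # 6
print(f"{delisted_share:.1%}")  # 90.0%
```
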
## The Per-Dataset Breakdown
| Dataset | DOJ Currently Lists | What We Have | Still Listed on DOJ |
|---------|---------------------|--------------|---------------------|
| Dataset 1 | 3,143 | 3,156 | 99.6% |
| Dataset 2 | 562 | 574 | 97.9% |
| Dataset 3 | 62 | 1,729 | 3.6% |
| Dataset 4 | 151 | 2,616 | 5.8% |
| Dataset 5 | 118 | 120 | 98.3% |
| Dataset 6 | 12 | 470 | 2.6% |
| Dataset 7 | 17 | 649 | 2.6% |
| Dataset 8 | 1,346 | 1,346+ | ~100% |
| Dataset 12 | 148 | 1,519 | 9.7% |
Datasets 3, 4, 6, 7, and 12 have been almost completely stripped. Dataset 3 went from 1,729 documents to 62. Dataset 6 from 470 to 12. Dataset 7 from 649 to 17.
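
The percentages in the last column of the table are the DOJ's current count divided by ours. A short sketch that recomputes them from the table's own numbers (the function name `share_still_listed` is hypothetical, and Dataset 8 is skipped because our count there is reported only as "1,346+"):

```python
# (DOJ currently lists, what we have), copied from the table above.
datasets = {
    "Dataset 1": (3143, 3156),
    "Dataset 2": (562, 574),
    "Dataset 3": (62, 1729),
    "Dataset 4": (151, 2616),
    "Dataset 5": (118, 120),
    "Dataset 6": (12, 470),
    "Dataset 7": (17, 649),
    "Dataset 12": (148, 1519),
}

def share_still_listed(listed: int, indexed: int) -> float:
    """Percentage of our indexed documents the DOJ still lists, to one decimal."""
    return round(100 * listed / indexed, 1)

rates = {name: share_still_listed(l, i) for name, (l, i) in datasets.items()}
for name, rate in rates.items():
    print(f"{name}: {rate}%")
```
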
## They Forgot One
We tested direct URL access on 20 delisted files. 19 returned proper 404 errors — the files were actually deleted from the server.
**EFTA00002614** returned HTTP 200. A real PDF. 463 kilobytes. They removed the link from the listing page but forgot to delete the actual file.
When you delist 50,000 documents, you're going to miss a few.
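
A direct-access check like this can be sketched in a few lines of Python. This is an illustration, not the tooling we ran: the helper names are hypothetical, the URLs in the demo are placeholders rather than real DOJ paths, and a plain `HEAD` request may not get past the site's robot check, in which case the same bucketing logic would need a browser-driven fetcher plugged in.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, List

def head_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a HEAD request to `url`."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is what we want.
        return err.code

def classify(urls: Iterable[str],
             fetch: Callable[[str], int] = head_status) -> Dict[str, List[str]]:
    """Bucket URLs by whether the server still serves the file."""
    report: Dict[str, List[str]] = {"still_served": [], "gone": [], "other": []}
    for url in urls:
        status = fetch(url)
        if status == 200:
            report["still_served"].append(url)
        elif status == 404:
            report["gone"].append(url)
        else:
            report["other"].append(url)
    return report

# Offline demo with a stubbed fetcher and placeholder URLs.
fake_statuses = {
    "https://example.invalid/a.pdf": 404,
    "https://example.invalid/b.pdf": 200,
}
report = classify(fake_statuses, fetch=fake_statuses.__getitem__)
print(report["still_served"])  # ['https://example.invalid/b.pdf']
```

Passing the fetcher in as a parameter keeps the bucketing testable without touching the network.
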
## The 6 We Don't Have
Only 6 files exist on the DOJ site that aren't in our index:
- **EFTA00000467 and EFTA00000468** — The Trump photos. These were the files silently removed in December 2025 that made national news. PBS, NPR, CNBC, and Axios all covered the disappearance. The DOJ restored them after the backlash.
- **EFTA00009781** — Added after our initial indexing.
- **EFTA02731790, EFTA02731812, EFTA02731852** — High-range EFTA numbers from Dataset 12, likely added in a subsequent release.
We're downloading all 6 now.
## What This Means
The Epstein Files Transparency Act required these documents to be released. The DOJ complied on January 30 and then quietly delisted 90% of them over the following three weeks.
Some removals are legitimate — the DOJ acknowledged that victim-identifying information was inadvertently released. Social Security numbers, names of minors, and unredacted photographs of victims should never have been published, and we support those removals.
But "several thousand documents," the DOJ's own description of what it took down, doesn't describe what happened here. 50,136 documents were delisted. That's not a surgical removal of victim-identifying information. That's a 90% reduction of the public release.
## The Methodology
This reconciliation used a Playwright-based scraper to navigate the DOJ's robot-check-protected website, scraping every page of Datasets 1-8 and 12. Datasets 9, 10, and 11 (containing thousands of pages each) will be reconciled separately.
Every EFTA ID found on the DOJ's live pages was compared against our indexer state file containing 55,689 unique EFTA IDs indexed from the original release.
The raw data, HTML report, and methodology are available on request.
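
The comparison step described above can be sketched as follows. This assumes the scraped pages are already in hand as HTML strings and the indexed IDs are already loaded into a set; fetching through Playwright and reading our state file are left out, since neither format is described here. The regex assumes every ID is `EFTA` followed by eight digits, which matches the IDs quoted in this post, and the function names and toy IDs are hypothetical.

```python
import re
from typing import Dict, Iterable, Set

# Assumption: an EFTA ID is "EFTA" followed by exactly eight digits.
EFTA_ID = re.compile(r"\bEFTA\d{8}\b")

def extract_ids(pages_html: Iterable[str]) -> Set[str]:
    """Collect every unique EFTA ID that appears in the scraped pages."""
    ids: Set[str] = set()
    for html in pages_html:
        ids.update(EFTA_ID.findall(html))
    return ids

def reconcile(live_ids: Set[str], indexed_ids: Set[str]) -> Dict[str, Set[str]]:
    """Split IDs into the three buckets used in the summary table."""
    return {
        "we_have_they_delisted": indexed_ids - live_ids,
        "they_have_we_dont": live_ids - indexed_ids,
        "both": live_ids & indexed_ids,
    }

# Toy data for illustration only.
pages = [
    '<a href="/files/EFTA00000001.pdf">EFTA00000001.pdf</a>',
    '<a href="/files/EFTA00000002.pdf">EFTA00000002.pdf</a>',
]
indexed = {"EFTA00000002", "EFTA00000003"}

result = reconcile(extract_ids(pages), indexed)
print(sorted(result["we_have_they_delisted"]))  # ['EFTA00000003']
print(sorted(result["they_have_we_dont"]))      # ['EFTA00000001']
print(sorted(result["both"]))                   # ['EFTA00000002']
```
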
## The Index Is Live
Every document we preserved is searchable at [epstein.dugganusa.com](https://epstein.dugganusa.com). The API is free. The index doesn't sleep.
50,136 documents. We have them. They're searchable. That's the story.
**Sources:**
- [DOJ Epstein Disclosures — Dataset 1](https://www.justice.gov/epstein/doj-disclosures/data-set-1-files)
- [PBS News — At least 16 files disappear from DOJ site](https://www.pbs.org/newshour/politics/at-least-16-files-disappear-from-doj-site-for-epstein-documents-including-trump-photo)
- [ABC News — DOJ says it's taken down 'several thousand documents'](https://abcnews.com/US/epstein-files-doj-thousand-documents-mistakenly-identified-victims/story?id=129787942)
- [Rep. Nancy Mace — Demands DOJ explain removals](https://mace.house.gov/media/press-releases/rep-nancy-mace-demands-doj-explain-why-epstein-files-were-removed-public)
*Every claim is verifiable. The API is free. The documents are searchable.*
*50,136 preserved. 6 missing. 1 they forgot to delete.*
*Her name was Renee Nicole Good.*
*His name was Alex Jeffery Pretti.*



