Onyx

enterprise-rag-bench

Open · MIT · v1.0

A RAG benchmark for the real world.

The first RAG benchmark built on company-internal data, not Wikipedia. 500K+ docs, 500 questions, 9 enterprise sources. Open-source, MIT-licensed. By Onyx.

How does your RAG stack up?
Overall score per system on the 500 enterprise RAG questions (higher is better):

System                       Score (%)
Onyx + GPT-5.4                  72.4
OpenClaw                        68.2
OpenAI File Search              61.0
RAGFlow                         50.2
Amazon Q (Kendra)               49.0
Azure AI Search                 48.4
Vertex AI Search                41.9
NVIDIA AI Blueprints            37.7
AnythingLLM                     35.6
Weaviate Verba                  34.5
LlamaIndex                      27.2
LangChain                       25.0
Open WebUI                      24.9

Full results: onyx.app/enterpriserag-bench

Talk to the team that built the benchmark

Run this on your data.

Same retrieval, same agentic refinement, same connectors, on your Slack, Drive, Jira, Confluence, and the rest of your stack. Bring a use case. We'll show you what it looks like.

§2 · Inside the dataset

We generated a realistic synthetic company with documents across 9 different sources.

511,962

documents across 9 sources

How it's built
Per-source document counts: 275K, 120K, 5K, 35K, 25K, 15K, 10K, 8K, 6K.
03 · Add noise
Inject realistic distractors. Off-topic threads, half-finished drafts, near-duplicate pages. The clutter retrieval has to ignore.
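A minimal sketch of what this step could look like, assuming a simple list-of-dicts corpus. The function name, noise ratio, and document shape are illustrative assumptions, not the benchmark's actual generation pipeline:

import random

def add_noise(corpus: list[dict], distractors: list[dict],
              noise_ratio: float = 0.3, seed: int = 0) -> list[dict]:
    """Interleave distractor documents (off-topic threads, half-finished
    drafts, near-duplicate pages) with the real corpus, then shuffle so
    retrieval cannot rely on document order."""
    rng = random.Random(seed)
    n_noise = min(int(len(corpus) * noise_ratio), len(distractors))
    noisy = corpus + rng.sample(distractors, n_noise)
    rng.shuffle(noisy)
    return noisy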

§3 · Compare

Head to Head

We evaluated different RAG products and frameworks on the benchmark questions to see how they stack up. See where each one wins across the different question types.

Per-system scores across ten question types: Basic, Semantic, Intra-Doc, Project, Constrained, Conflicting, Completeness, Misc., High Level, and Not Found.
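As a rough sketch of how such a per-type breakdown can be computed from graded results (the JSONL path and the "type" and "score" field names are assumptions about the released format, not a documented schema):

import json
from collections import defaultdict

def per_type_scores(results_path: str) -> dict[str, float]:
    """Mean score per question type from a JSONL of graded answers."""
    by_type: dict[str, list[float]] = defaultdict(list)
    with open(results_path) as f:
        for line in f:
            record = json.loads(line)
            by_type[record["type"]].append(float(record["score"]))
    return {t: 100 * sum(s) / len(s) for t, s in by_type.items()}

for qtype, score in sorted(per_type_scores("graded.jsonl").items()):
    print(f"{qtype:>12}: {score:5.1f}%")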

§4 · Citation

@misc{sun2026enterpriseragbench,
  title        = {EnterpriseRAG-Bench: A RAG Benchmark for Company Internal Knowledge},
  author       = {Sun, Yuhong and Rahmfeld, Joachim and Weaver, Chris and Desai, Roshan
                  and Huang, Wenxi and Chen, Weijia and Butler, Mark H.},
  year         = {2026},
  howpublished = {\url{https://github.com/onyx-dot-app/EnterpriseRAG-Bench}},
  note         = {Draft. Final paper forthcoming.}
}

Think your RAG can beat the field?

Run the 500-question test set against your system. Email your results to joachim@onyx.app. We verify them, then add you to the public leaderboard.

Built by Onyx · Open-source, MIT-licensed

Open source

Ship a script or notebook that runs your system on the released corpus. We re-run it, and you get scored.
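A minimal sketch of what such a script might look like. The file names and the answer() hook are placeholders for your own stack, not a format Onyx has specified:

import json

def answer(question: str) -> str:
    # Replace this stub with a call into your retrieval + generation pipeline.
    return "TODO: answer from your RAG system"

with open("questions.jsonl") as fin, open("answers.jsonl", "w") as fout:
    for line in fin:
        q = json.loads(line)
        record = {"id": q["id"], "answer": answer(q["question"])}
        fout.write(json.dumps(record) + "\n")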

Submit your repo

Closed source

Point us at a sandbox or API endpoint. We hit it with the question set and verify your numbers.
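One possible shape for that endpoint, sketched with FastAPI; the route and payload fields are assumptions, since no wire format is published here:

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Query(BaseModel):
    question: str

@app.post("/answer")
def answer(query: Query) -> dict:
    # Replace the stub with your retrieval + generation pipeline.
    return {"answer": f"stub answer for: {query.question}"}

Start it with uvicorn, and the grader can POST the question set one query at a time.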

Submit your endpoint