Open-Source RAG Evaluation Framework

Most RAG pipelines are evaluated on vibes alone. This framework brings structured, repeatable evaluation to retrieval-augmented generation (RAG).

What It Measures

Retrieval quality — precision, recall, and MRR of retrieved chunks
Answer faithfulness — does the answer actually follow from the retrieved context?
Hallucination detection — claims in the answer that aren’t grounded in any source
End-to-end correctness — compared against golden test sets
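
As an illustration of the retrieval-quality metrics listed above, here is a minimal sketch of precision@k, recall@k, and mean reciprocal rank (MRR) over chunk IDs. It is not this framework's actual API: the function names, the use of string chunk IDs, and the choice to divide precision by the number of chunks actually returned (rather than by k) are all assumptions made for the example. It also assumes each retrieved list contains no duplicate IDs.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved chunk IDs that are relevant.

    Assumption: the denominator is the number of chunks actually returned
    (at most k), so a short result list is not penalized for missing slots.
    """
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for chunk_id in top_k if chunk_id in relevant)
    return hits / len(top_k)


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant chunk IDs that appear in the top-k results."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant chunk (ranks start at 1), or 0 if none."""
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical chunk IDs for two queries from a golden test set.
    query_runs = [
        (["c3", "c1", "c7", "c2"], {"c1", "c2"}),
        (["c5"], {"c5"}),
    ]
    first_retrieved, first_relevant = query_runs[0]
    print("precision@3:", precision_at_k(first_retrieved, first_relevant, k=3))
    print("recall@3:   ", recall_at_k(first_retrieved, first_relevant, k=3))
    print("MRR:        ", mean_reciprocal_rank(query_runs))
```

Precision and recall here are computed per query at a fixed cutoff, while MRR is averaged across queries. A real evaluation run would typically report the per-query values averaged over the whole test set as well.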