Evaluation Methodology: Fact-Checking
Method Overview
The Fact-Check Evaluation method is designed to verify the accuracy, clarity, and reliability of responses generated by AI Agents within Retrieval-Augmented Generation (RAG) workflows. By comparing the Agent's responses directly against the retrieved context, this approach checks that generated content is grounded in the source material and free from hallucinations.
In Fact-Check evaluations, each test case involves:
- A single LLM agent response generated using RAG.
- A set of retrieved knowledge chunks serving as a ground-truth reference for the agent’s response.
An LLM judge then assesses the response against these retrieved knowledge chunks, scoring key metrics such as Completeness, Faithfulness, and Non-Hallucination.
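To make the workflow concrete, below is a minimal Python sketch of how such a test case and judge call could be wired together. The names used here (`FactCheckCase`, `build_judge_prompt`, `fact_check`, `judge_llm`) and the JSON scoring format are illustrative assumptions, not the actual evaluation API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List
import json

# Hypothetical shape of a single Fact-Check test case; field names are
# illustrative, not the product's real schema.
@dataclass
class FactCheckCase:
    agent_response: str          # the RAG-generated answer under evaluation
    retrieved_chunks: List[str]  # ground-truth reference passages

def build_judge_prompt(case: FactCheckCase) -> str:
    """Assemble the instruction given to the LLM judge.

    The judge grades the response against the retrieved chunks on
    Completeness, Faithfulness, and Non-Hallucination.
    """
    context = "\n\n".join(
        f"[chunk {i + 1}] {chunk}" for i, chunk in enumerate(case.retrieved_chunks)
    )
    return (
        "You are a fact-checking judge. Using ONLY the reference chunks below,\n"
        "rate the response from 0 to 1 on each metric and explain briefly.\n\n"
        f"Reference chunks:\n{context}\n\n"
        f"Response to evaluate:\n{case.agent_response}\n\n"
        "Metrics: Completeness, Faithfulness, Non-Hallucination.\n"
        'Answer as JSON: {"completeness": ..., "faithfulness": ..., '
        '"non_hallucination": ..., "rationale": "..."}'
    )

def fact_check(case: FactCheckCase, judge_llm: Callable[[str], str]) -> Dict[str, float]:
    # `judge_llm` is any callable that takes a prompt string and returns the
    # judge model's JSON reply; wiring up a real model is left to the reader.
    return json.loads(judge_llm(build_judge_prompt(case)))
```

In practice, the per-metric scores returned for each test case can be aggregated across a test set, which is what makes it possible to compare prompts or RAG configurations against one another.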
The results of Fact-Check Evaluation can be used to tune prompts or to select a better RAG approach for your specific use case.