Evaluation Methodology: Fact-Checking
Method Overview
The Fact-Check Evaluation method is designed to verify the accuracy, clarity, and reliability of responses generated by AI Agents within Retrieval-Augmented Generation (RAG) workflows. By comparing the Agent's responses directly against the retrieved context, this approach checks that generated content is grounded in the source material and free from hallucinations.
In Fact-Check evaluations, each test case involves:
- A single LLM agent response generated using RAG.
- A set of retrieved knowledge chunks serving as a ground-truth reference for the agent’s response.
An LLM judge then assesses the response against these retrieved knowledge chunks, scoring key metrics such as Completeness, Faithfulness, and Non-Hallucination.
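To make the workflow concrete, below is a minimal Python sketch of how such a test case and judge call could be wired together. The names used here (`FactCheckCase`, `build_judge_prompt`, `fact_check`, `judge_llm`) and the JSON scoring format are illustrative assumptions, not the actual evaluation API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List
import json

# Hypothetical shape of a single Fact-Check test case; field names are
# illustrative, not the product's real schema.
@dataclass
class FactCheckCase:
    agent_response: str          # the RAG-generated answer under evaluation
    retrieved_chunks: List[str]  # ground-truth reference passages

def build_judge_prompt(case: FactCheckCase) -> str:
    """Assemble the instruction given to the LLM judge.

    The judge grades the response against the retrieved chunks on
    Completeness, Faithfulness, and Non-Hallucination.
    """
    context = "\n\n".join(
        f"[chunk {i + 1}] {chunk}" for i, chunk in enumerate(case.retrieved_chunks)
    )
    return (
        "You are a fact-checking judge. Using ONLY the reference chunks below,\n"
        "rate the response from 0 to 1 on each metric and explain briefly.\n\n"
        f"Reference chunks:\n{context}\n\n"
        f"Response to evaluate:\n{case.agent_response}\n\n"
        "Metrics: Completeness, Faithfulness, Non-Hallucination.\n"
        'Answer as JSON: {"completeness": ..., "faithfulness": ..., '
        '"non_hallucination": ..., "rationale": "..."}'
    )

def fact_check(case: FactCheckCase, judge_llm: Callable[[str], str]) -> Dict[str, float]:
    # `judge_llm` is any callable that takes a prompt string and returns the
    # judge model's JSON reply; wiring up a real model is left to the reader.
    return json.loads(judge_llm(build_judge_prompt(case)))
```

In practice, the per-metric scores returned for each test case can be aggregated across a test set, which is what makes it possible to compare prompts or RAG configurations against one another.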
The results of Fact-Check Evaluation can be used to tune prompts or to select a better RAG approach for your specific use case.