Hallucination

RAG Metric

Introduction

The Hallucination metric detects when your AI Agent generates information that is not supported by the provided knowledge base context. It catches fabricated facts, made-up details, and unsupported claims — ensuring your Agent only communicates what it actually knows.

warning

This metric uses inverted scoring — a lower score is better. A score of 0.0 means no hallucinations were detected, while 1.0 means the entire response is hallucinated. This is the opposite of all other metrics, where higher scores are better.


When to Use This Metric

  • You need to ensure your Agent doesn't fabricate information when answering questions from a knowledge base.
  • You're testing whether the Agent stays grounded in the provided documents.
  • You want to catch cases where the Agent confidently presents false information.
  • You're validating that knowledge base updates don't cause the Agent to fill gaps with invented details.
  • You need to build trust with users by verifying factual accuracy.

Configuration

| Parameter | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| threshold | float | 0.0 | No | Maximum accepted hallucination score. If 0, the metric only passes when there are no hallucinations. |
| strict_mode | boolean | false | No | Rounds the score to 1.0 or 0.0 based on the threshold. |

info

This metric requires a trained knowledge base attached to your AI Agent.
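
As a concrete reference, the sketch below captures these two settings in code. The class and field names are illustrative assumptions, not the platform's actual API; consult the SDK reference for the exact configuration syntax.

```python
from dataclasses import dataclass

# Hypothetical names for illustration only; check the SDK reference for the real API.
@dataclass
class HallucinationMetricConfig:
    threshold: float = 0.0     # maximum accepted hallucination score (lower is better)
    strict_mode: bool = False  # round the reported score to 0.0 or 1.0 around the threshold

# Default setup: any hallucinated claim fails the test.
default_config = HallucinationMetricConfig()

# A more lenient setup that tolerates minor embellishments such as transitional phrases.
lenient_config = HallucinationMetricConfig(threshold=0.1)
```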


How It Works

  1. The AI Agent receives the input message, retrieves context from the knowledge base, and generates a response.
  2. The testing LLM extracts individual claims and statements from the Agent's response.
  3. Each claim is compared against the knowledge base context (both the ground truth and retrieved chunks).
  4. Claims that are not supported by the context are flagged as hallucinations.
  5. The score reflects the proportion of hallucinated content in the response; the arithmetic is sketched below.
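
To make step 5 concrete, here is a minimal sketch of the scoring arithmetic, assuming the testing LLM has already produced a supported/unsupported verdict for each extracted claim. The function name and verdict format are assumptions for illustration only.

```python
def hallucination_score(claim_verdicts: list[bool]) -> float:
    """Return the proportion of claims NOT supported by the knowledge base context.

    claim_verdicts[i] is True when claim i is backed by the context
    (ground truth or retrieved chunks) and False when it is hallucinated.
    """
    if not claim_verdicts:
        return 0.0  # no claims extracted, so nothing can be hallucinated
    unsupported = sum(1 for supported in claim_verdicts if not supported)
    return unsupported / len(claim_verdicts)

# Two supported claims and one hallucinated claim, as in the example below.
print(round(hallucination_score([True, True, False]), 2))  # 0.33
```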

Scoring

  • Range: 0.0 to 1.0 (lower is better).
  • Low score (close to 0.0): The response is well-grounded — few or no hallucinations detected.
  • High score (close to 1.0): The response contains significant fabricated information.
  • Pass condition: The score must be less than or equal to the configured threshold (default 0.0, meaning zero hallucinations); a minimal pass check is sketched below.
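
The following sketch applies the pass condition, including the strict_mode rounding described in the configuration table. The function is illustrative only; the platform's exact semantics may differ.

```python
def passes(score: float, threshold: float = 0.0, strict_mode: bool = False) -> bool:
    """Lower is better: the metric passes when the score is at or below the threshold."""
    if strict_mode:
        # Collapse the score to the extremes around the threshold, as described above.
        score = 0.0 if score <= threshold else 1.0
    return score <= threshold

print(passes(0.0))                  # True: no hallucinations detected
print(passes(0.33))                 # False: the example below fails at the default threshold
print(passes(0.33, threshold=0.4))  # True: passes under a more lenient threshold
```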

Example

Input: "What is your return policy?"

Knowledge base context: "Items can be returned within 30 days of purchase. Items must be in original packaging."

AI Response: "Our return policy allows returns within 30 days of purchase. Items must be in their original packaging. We also offer free return shipping on all orders."

Score: 0.33

Result: Failed (threshold: 0.0)

The response includes one hallucinated claim — "free return shipping on all orders" — which is not mentioned in the knowledge base context. Two of the three extracted claims are supported, so the score is 1/3 ≈ 0.33, which exceeds the 0.0 threshold and fails the test.


Tips for Improving Scores

  • Add explicit instructions in your system prompt to only answer based on provided context (e.g., "If you don't find the answer in the knowledge base, say you don't know"); a fuller prompt example appears after this list.
  • Ensure your knowledge base contains comprehensive information to reduce the Agent's temptation to fill in gaps.
  • Consider raising the threshold slightly above 0.0 if minor embellishments (like transitional phrases) are acceptable.
  • Review the reason text to identify which specific claims are being hallucinated — this often reveals gaps in your knowledge base.
  • Keep your knowledge base up to date to minimize outdated information that could trigger hallucination detection.
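
As an illustration of the first tip, a grounding instruction for the system prompt might look like the sketch below. The wording and the GROUNDING_INSTRUCTIONS name are only examples; adapt them to your Agent's tone and domain.

```python
# Illustrative grounding instructions to append to the Agent's system prompt.
GROUNDING_INSTRUCTIONS = (
    "Answer only with information found in the provided knowledge base context. "
    "If the context does not contain the answer, say you don't know instead of guessing. "
    "Never add details, offers, or policies that are not explicitly stated in the context."
)
```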