Prompt Alignment

Generic Metric

Introduction

The Prompt Alignment metric measures how well your AI Agent's response follows the instructions defined in its system prompt. It validates that the Agent adheres to your guidelines, tone, boundaries, and behavioral rules.


When to Use This Metric

  • You want to ensure the Agent follows specific behavioral guidelines (e.g., "always respond in a professional tone").
  • You're testing whether the Agent respects boundaries defined in the system prompt (e.g., "do not discuss competitor products").
  • You need to verify that prompt changes don't cause the Agent to ignore existing instructions.
  • You're validating that the Agent maintains a consistent persona across different types of questions.
  • You want to catch cases where the Agent deviates from your brand voice or guidelines.

Configuration

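| Parameter   | Type    | Default | Required | Description                                                          |
|-------------|---------|---------|----------|----------------------------------------------------------------------|
| threshold   | float   | 0.8     | No       | Score threshold for passing (0.0–1.0).                               |
| strict_mode | boolean | false   | No       | When enabled, rounds the score to 1.0 or 0.0 based on the threshold. |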
Note: This metric evaluates alignment against the system prompt associated with the eval case. If the eval case defines a system prompt override, the metric uses the override; otherwise, it uses the Agent's default system prompt.
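
The configuration and the prompt-selection rule can be sketched in a few lines of Python. This is a minimal illustration, not the product's API: `PromptAlignmentConfig` and `resolve_system_prompt` are hypothetical names, and representing "no override" as `None` is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PromptAlignmentConfig:
    """Hypothetical container mirroring the two documented parameters."""

    threshold: float = 0.8     # documented default
    strict_mode: bool = False  # documented default

    def __post_init__(self) -> None:
        # The documented range for the threshold is 0.0 to 1.0.
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError("threshold must be between 0.0 and 1.0")


def resolve_system_prompt(override: Optional[str], agent_default: str) -> str:
    """Return the eval case override if one is set, else the Agent's default."""
    # Assumption: an eval case without an override carries None here.
    return override if override is not None else agent_default
```

With these definitions, `resolve_system_prompt(None, "default prompt")` returns the Agent's default, while passing any override string returns that override instead.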


How It Works

  1. The AI Agent receives the input message and generates a response.
  2. The testing LLM reads the system prompt instructions associated with the eval case.
  3. The testing LLM evaluates whether the Agent's response follows the guidelines, tone, boundaries, and rules defined in the system prompt.
  4. The testing LLM produces a score between 0.0 and 1.0 reflecting how well the response aligns with the system prompt.
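
The steps above can be sketched as a small pipeline. Everything named here is hypothetical: the real Agent and testing LLM interfaces are not described in this document, so both are modeled as plain callables, and `toy_judge` is a keyword check standing in for an actual LLM call. Clamping the score to the documented range is also an assumption.

```python
from typing import Callable

# Hypothetical interfaces, assumed for illustration only.
Agent = Callable[[str, str], str]    # (system_prompt, user_input) -> response
Judge = Callable[[str, str], float]  # (system_prompt, response) -> raw score


def evaluate_prompt_alignment(
    system_prompt: str, user_input: str, agent: Agent, judge: Judge
) -> float:
    # Step 1: the Agent receives the input and generates a response.
    response = agent(system_prompt, user_input)
    # Steps 2-3: the testing LLM reads the system prompt and evaluates
    # whether the response follows it.
    raw_score = judge(system_prompt, response)
    # Step 4: produce a score in the documented 0.0-1.0 range
    # (clamping is an assumption, not documented behavior).
    return max(0.0, min(1.0, raw_score))


def stub_agent(system_prompt: str, user_input: str) -> str:
    # Stand-in for a real Agent call.
    return "Please contact our sales team for pricing details."


def toy_judge(system_prompt: str, response: str) -> float:
    # Stand-in for the testing LLM: a crude keyword check, not a real judge.
    return 1.0 if "sales team" in response.lower() else 0.0


score = evaluate_prompt_alignment(
    "Never discuss pricing. Direct pricing questions to the sales team.",
    "How much does your product cost?",
    stub_agent,
    toy_judge,
)
```

Passing the Agent and the judge in as callables keeps the sketch independent of any particular model provider; in practice both would be calls to real models.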

Scoring

  • Range: 0.0 to 1.0 (higher is better).
  • High score (close to 1.0): The response closely follows the system prompt instructions.
  • Low score (close to 0.0): The response ignores or contradicts the system prompt instructions.
  • Pass condition: The score must be greater than or equal to the configured threshold.
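
The pass condition and the effect of `strict_mode` can be written out as a short function. This is a sketch under one stated assumption: the documentation says only that strict mode rounds the score "based on threshold", so the rule used here (a score at or above the threshold becomes 1.0, anything below becomes 0.0) is an interpretation. The function name `apply_threshold` is hypothetical.

```python
from typing import Tuple


def apply_threshold(
    score: float, threshold: float = 0.8, strict_mode: bool = False
) -> Tuple[float, bool]:
    """Return (reported_score, passed) for a raw alignment score."""
    # Documented pass condition: score >= threshold.
    passed = score >= threshold
    if strict_mode:
        # Assumed rounding rule: 1.0 when the threshold is met, else 0.0.
        score = 1.0 if passed else 0.0
    return score, passed
```

For instance, `apply_threshold(0.79)` reports the score unchanged and fails, while `apply_threshold(0.79, strict_mode=True)` reports 0.0 and fails.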

Example

System prompt: "You are a customer support agent for a SaaS company. Always be professional and concise. Never discuss pricing — direct pricing questions to the sales team."

Input: "How much does your product cost?"

AI Response: "Great question! For detailed pricing information, I'd recommend reaching out to our sales team who can help you find the best plan for your needs. You can contact them at sales@example.com."

Score: 0.95

Result: Passed (threshold: 0.8)

The response correctly redirects the pricing question to the sales team, following the system prompt instruction.


Tips for Improving Scores

  • Write clear, specific instructions in your system prompt. Vague guidelines lead to inconsistent alignment.
  • Use explicit rules (e.g., "Never do X" or "Always do Y") rather than soft suggestions.
  • Test edge cases where the Agent might be tempted to break the rules (e.g., a user repeatedly asking about a forbidden topic).
  • If the Agent partially follows instructions, check whether your system prompt has conflicting guidelines.
  • Use the system prompt override in eval cases to test different instruction sets without modifying the Agent.