🎥 Interactive Walkthrough

Explore the Reliability Monitoring dashboard in action:


📊 Key Metrics Overview

Monitor the most important indicators of AI reliability:

  1. Total Volume – The number of queries processed.
  2. Reliability Index – A measure of how consistently your AI performs without failures or inaccuracies.
  3. Factual Error Rate – The percentage of responses that were factually incorrect.
  4. Contextual Error Rate – The frequency at which responses are incorrect due to misinterpretation of context.
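
The two error rates above are simple percentages of total volume. Here is a minimal sketch of that arithmetic, assuming each evaluated query carries two boolean flags named `factual_error` and `contextual_error`; those names are hypothetical and not taken from the dashboard. The Reliability Index is left out because its formula is not given here.

```python
from typing import Iterable, Mapping


def error_rates(records: Iterable[Mapping[str, bool]]) -> dict:
    """Return total volume and error rates (as percentages) for a batch of queries.

    The keys `factual_error` and `contextual_error` are hypothetical flags,
    assumed for illustration only.
    """
    records = list(records)
    total = len(records)
    if total == 0:
        # Avoid division by zero when no queries were processed.
        return {
            "total_volume": 0,
            "factual_error_rate": 0.0,
            "contextual_error_rate": 0.0,
        }
    factual = sum(1 for r in records if r.get("factual_error"))
    contextual = sum(1 for r in records if r.get("contextual_error"))
    return {
        "total_volume": total,
        "factual_error_rate": 100.0 * factual / total,
        "contextual_error_rate": 100.0 * contextual / total,
    }


sample = [
    {"factual_error": True, "contextual_error": False},
    {"factual_error": False, "contextual_error": False},
    {"factual_error": False, "contextual_error": True},
    {"factual_error": False, "contextual_error": False},
]
print(error_rates(sample))
```

With four queries, one factual error, and one contextual error, both rates come out to 25%.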

📈 Critical Summaries

Understand trends over time with visual insights:

  • Total Queries vs. Flagged Queries – A time-series graph that compares the number of total queries with the number of flagged ones.

  • Flagged Categories – A breakdown of flagged queries by category, to help identify common reliability issues.
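
A total-versus-flagged time series can be built by bucketing queries per day. The sketch below assumes each query is available as a `(timestamp, flagged)` pair; that input shape and the function name `daily_counts` are illustrative assumptions, not the dashboard's actual data model.

```python
from collections import Counter
from datetime import datetime


def daily_counts(events):
    """Count total and flagged queries per calendar day.

    `events` is an iterable of (timestamp, flagged) pairs; this shape is
    an assumption made for the example.
    """
    total = Counter()
    flagged = Counter()
    for ts, is_flagged in events:
        day = ts.date()
        total[day] += 1
        if is_flagged:
            flagged[day] += 1
    # One row per day, in chronological order: (day, total, flagged).
    return [(day, total[day], flagged[day]) for day in sorted(total)]


events = [
    (datetime(2024, 1, 1, 9, 30), False),
    (datetime(2024, 1, 1, 14, 0), True),
    (datetime(2024, 1, 2, 8, 15), False),
]
for day, n_total, n_flagged in daily_counts(events):
    print(day, n_total, n_flagged)
```

Each row gives one point on both lines of the graph: the day's total and the day's flagged count.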


🧠 Hallucination Categories

Identify and classify the types of hallucinations detected in AI responses:

  • Logical – Incorrect reasoning or fallacies.
  • Temporal – Misstatements about time, dates, or sequencing.
  • Entity – Inaccuracies about named entities (people, places, organizations).
  • Contextual – Misunderstandings within conversational context.
  • Other – Unclassified or ambiguous errors.
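
For downstream analysis, the five categories map naturally onto an enumeration. This is a sketch only: the class name `HallucinationCategory`, the helper names, and the choice to send unrecognized labels to `OTHER` are assumptions for illustration.

```python
from collections import Counter
from enum import Enum


class HallucinationCategory(Enum):
    LOGICAL = "Logical"
    TEMPORAL = "Temporal"
    ENTITY = "Entity"
    CONTEXTUAL = "Contextual"
    OTHER = "Other"


def parse_category(label: str) -> HallucinationCategory:
    """Map a free-text label to a category, falling back to OTHER.

    The fallback is an assumption: unknown labels are treated as unclassified.
    """
    try:
        return HallucinationCategory(label.strip().title())
    except ValueError:
        return HallucinationCategory.OTHER


def tally(labels):
    """Count how often each category appears in a list of labels."""
    return Counter(parse_category(label) for label in labels)


counts = tally(["Logical", "entity", "Entity", "numeric"])
for category, n in counts.items():
    print(category.value, n)
```

A tally like this is the kind of per-category count a flagged-categories breakdown would display.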

🔍 Live Data

Access detailed evaluation records in real time:

Field                Description
ID                   Unique identifier for the conversation or query
Conversation Prefix  Initial input or prompt given to the assistant
Model Response       The generated response from the Model
Judge Output         Evaluation or score given by the system or human judge
Feedback             Reviewer feedback
Categories           Labeled types of hallucination or issues
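
If these records are exported for offline analysis, the fields listed above could be modeled as a small data class. This is a sketch under stated assumptions: the attribute names are snake_case renderings of the field names, the types are guesses, and `EvaluationRecord` and `with_category` are hypothetical names rather than part of any documented API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EvaluationRecord:
    # Attribute names mirror the documented fields; the types are assumed.
    id: str
    conversation_prefix: str
    model_response: str
    judge_output: str
    feedback: Optional[str] = None
    categories: List[str] = field(default_factory=list)


def with_category(records, category):
    """Return the records labeled with the given category (case-insensitive)."""
    wanted = category.lower()
    return [r for r in records if any(c.lower() == wanted for c in r.categories)]


records = [
    EvaluationRecord("q-1", "When was the bridge built?", "In 1850.", "fail",
                     feedback="Wrong year", categories=["Temporal"]),
    EvaluationRecord("q-2", "Who wrote the report?", "The finance team.", "pass"),
]
print([r.id for r in with_category(records, "temporal")])
```

Filtering on `categories` like this corresponds to drilling into one hallucination type from the record list.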