Now in Public Beta

Detect LLM Hallucinations

Trace every LLM call, evaluate output alignment with your system prompts, and get real-time alerts when your AI goes off-script.

Near-instant eval — up to 2,000+ TPS
Integrate in under 5 minutes
quickstart.py
# pip install hallutraceai
from hallutraceai import HalluTrace

ht = HalluTrace(api_key="sk_live_...")

ht.trace(
    session_id="chat-123",
    type="agent",
    input="What is Python?",
    output="Python is a programming language.",
    system_prompt="You are a helpful assistant."
)
# Non-blocking — returns instantly. We evaluate in background.

Works with any LLM provider

OpenAI · Anthropic · Google Gemini · Mistral · Cohere · LangChain · LlamaIndex · n8n · FlowiseAI · Hugging Face · Any LLM

134+ Evals/sec · <1.2s Latency · 99.9% Uptime

Dashboard

Catch Every Hallucination in Real Time

Live scores, detection trends, model comparisons, and session monitoring — updating as your LLM runs.

Project: My AI Chatbot — last 7 days overview (Healthy · 1,247 evals)

Avg Score: 12.4 · Sessions: 342 · Flagged: 18 · Messages: 2.8K

Hallucination Score Distribution

  • 0–10: 872
  • 11–30: 561
  • 31–50: 312
  • 51–70: 149
  • 71–100: 62

Eval Detection Trends (Last 7 Days)

Hallucination, toxicity, bias, and PII detections plotted over the past week.

Topic Detection


Custom Detection Breakdown

  • Clean: 68%
  • Hallucination: 14%
  • Toxicity: 7%
  • Bias: 5%
  • PII Leak: 4%
  • Off-Topic: 2%

Model Comparison (hallucination / toxicity / bias rates)

  • GPT-4o: 12%
  • Claude 3.5: 9%
  • Gemini Pro: 18%
  • Mistral L: 22%
  • Llama 3: 16%

RAG Faithfulness Score

Context match: 91% · Source citation: 84% · Grounded: 87%

Sentiment & Tone Analysis

  • Professional: 42%
  • Friendly: 28%
  • Neutral: 18%
  • Formal: 8%
  • Negative: 4%

Response Cost Analytics

  • GPT-4o: $125 / 12.4K requests
  • Claude 3.5: $89 / 8.9K requests
  • Gemini Pro: $46 / 15.2K requests
  • Mistral L: $28 / 6.1K requests

Total: $288 across 42.6K requests · Avg: $6.76/1K

Prompt Compliance Score

Format: 96% · Tone: 92% · Boundaries: 94% · Instructions: 91%

Anomaly Detection

2 anomalies (spikes above threshold) detected across the 24-hour window · Auto-alert triggered

Session | Msgs | Score
chat-52fc | 5 | 24
chat-3927 | 14 | 17
chat-c954 | 13 | 44
chat-25fe | 16 | 72
chat-6284 | 15 | 80

Real-Time Hallucination Correction

HalluTrace scans every agent response against your RAG sources and system prompt in real time. When hallucination or prompt violation is detected, it automatically signals your agent to retry — before the user ever sees a wrong answer.

Available on Pay as You Go & MAX-T plans.

Auto-Scan Every Response

Every agent reply is checked against RAG sources, system prompt, and context — inputs, outputs, and metadata.

Detect & Intercept

Catches RAG data mismatches and system prompt violations. A score above your threshold triggers automatic correction.

Auto-Correct & Verify

Signals your agent to retry with source grounding. Re-scans the corrected response before delivery.
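The scan → intercept → retry loop above can be sketched in a few lines of Python. Everything here is a stand-in, not the real SDK surface: `evaluate` mimics HalluTrace's verdict with a toy grounding check, and the agent signature (including the `grounded` flag) is hypothetical — in production, HalluTrace evaluates asynchronously and signals your agent.

```python
# Sketch of the detect-and-retry loop, assuming a hypothetical agent callable.
THRESHOLD = 50  # default hallucination threshold

def evaluate(output: str, sources: list[str]) -> int:
    """Toy stand-in scorer: flag output that cites nothing from the RAG sources."""
    grounded = any(src.lower() in output.lower() for src in sources)
    return 10 if grounded else 80  # 0 = perfect, 100 = hallucinated

def answer_with_retry(agent, question: str, sources: list[str], max_retries: int = 2) -> str:
    output = agent(question, sources)
    for _ in range(max_retries):
        if evaluate(output, sources) <= THRESHOLD:
            return output  # passes the check -- safe to deliver
        # Re-prompt the agent with explicit grounding instructions and re-scan.
        output = agent(question, sources, grounded=True)
    return output
```

The point is the ordering: the corrected response is re-scanned before it is ever returned, so the user only sees an answer that passed the check (or the last retry).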

Features

Everything you need to trust your LLM

From trace ingestion to hallucination scoring to real-time alerts — one platform to monitor and evaluate your AI outputs.

Real-Time Tracing

Capture every LLM call — inputs, outputs, system prompts, model names. Grouped by chat session automatically.

Hallucination Detection

LLM-as-judge evaluates if outputs align with your system prompts. Scores from 0 (perfect) to 100 (hallucinated).

Instant Alerts

Get notified via email, SMS, or webhook when hallucination scores exceed your threshold — 50 by default.

Rich Analytics

Score trends, distributions, model comparisons, session breakdowns — all with animated, interactive charts.

CSV Data Tables

No SDK? Upload CSV files with your LLM data and run hallucination checks directly from the dashboard.
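For the no-SDK path, a compatible CSV can be produced with the standard library. The column names below mirror the SDK's trace fields and are an assumption about the expected upload format — match whatever the upload dialog asks for.

```python
# Build an upload-ready CSV from existing LLM logs (assumed column names).
import csv
import io

rows = [
    {"session_id": "chat-123",
     "input": "What is Python?",
     "output": "Python is a programming language.",
     "system_prompt": "You are a helpful assistant."},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["session_id", "input", "output", "system_prompt"])
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()  # save to a .csv file and upload via the dashboard
```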

Simple Integration

3 lines of Python. Or use our REST API. Or swap your OpenAI base URL. Works with any LLM provider.
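The REST path might be assembled like this. The endpoint URL and payload field names are assumptions modeled on the quickstart above, not documented values — check the API reference for the real ones.

```python
# Sketch of the REST alternative to the SDK (hypothetical endpoint and fields).
import json

def build_trace_request(api_key: str, session_id: str, input_text: str,
                        output_text: str, system_prompt: str) -> dict:
    """Assemble the parts of a hypothetical POST /v1/trace call."""
    return {
        "url": "https://api.hallutrace.ai/v1/trace",  # assumed endpoint
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "session_id": session_id,
            "type": "agent",
            "input": input_text,
            "output": output_text,
            "system_prompt": system_prompt,
        }),
    }
```

Pass the three parts to any HTTP client; like the SDK, the call is fire-and-forget.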

How It Works

Three steps to hallucination-free AI

01

Integrate SDK

Install our Python or JS SDK. Add 3 lines of code. Every LLM call is now traced — inputs, outputs, system prompts, and metadata.

02

Auto Evaluate

Our engine automatically scores each response for hallucination. LLM-as-judge checks alignment with your system prompt. Score 0-100.

03

Monitor & Alert

View scores in your dashboard with rich charts. Set thresholds. Get instant alerts via email, SMS, or webhook when things go wrong.

Pricing

Simple, token-based pricing

Start free with 10M tokens/month. Scale as you grow. All plans include Zero Storage mode. Paid plans include bring-your-own-model support.

Free

$0 / forever

Perfect for trying out hallucination detection.

Start Free

Infrastructure

  • 10M tokens / month
  • Standard rate limit
  • Up to 1 Gbps uplink
  • 128K tokens per trace

Platform

  • Unlimited projects
  • 1 team member
  • 7-day data retention
  • Zero Storage option
  • CSV data tables

Monitoring

  • Full analytics & charts
  • Email alerts
  • End-to-end encryption
Most Popular

Pay as You Go

$0.08 / 1M input tokens (128K model)

Deposit $25 to start. Scale without limits. Pay only for what you use.

Get Started

Infrastructure

  • Unlimited tokens
  • Enhanced rate limit
  • Up to 2 Gbps uplink
  • Up to 2M tokens per trace *

Platform

  • Unlimited projects
  • 1 team member
  • 90-day data retention
  • Zero Storage option
  • CSV data tables

Pricing

  • 128K model: $0.08/1M input, $0.18/1M output
  • 2M model: $0.28/1M input, $0.68/1M output

Monitoring

  • Full analytics & charts
  • Email & SMS alerts
  • End-to-end encryption

Advanced Analytics

  • Topic Detection (auto-classify conversation topics)
  • Custom Detection Breakdown
  • Model Comparison analytics
  • RAG Faithfulness scoring
  • Sentiment & Tone analysis
  • Response Cost Analytics
  • Prompt Compliance scoring
  • Anomaly Detection with alerts

Bring Your Own Model

  • Custom OpenAI SDK /v1 endpoint
  • 80% token discount with your own model
Max Tracing

MAX-T

$895 / month

Max Tracing — full control, custom integrations, and dedicated support.

Start 30-Day Free Trial

Infrastructure

  • Highest priority rate limit
  • Up to 5 Gbps uplink
  • Up to 2M tokens per trace *
  • 15% discount on all transactions

Platform

  • Everything in Pay as You Go
  • 10 projects
  • 365-day data retention
  • Zero Storage option
  • Unlimited team members

Advanced Eval

  • Go Beyond Hallucination (toxicity, bias, PII & more)
  • Custom Eval LLM Endpoint
  • Custom OpenAI SDK /v1 endpoint
  • 80% token discount with your own model

Advanced Analytics

  • Topic Detection (auto-classify conversation topics)
  • Custom Detection Breakdown
  • Model Comparison analytics
  • RAG Faithfulness scoring
  • Sentiment & Tone analysis
  • Response Cost Analytics
  • Prompt Compliance scoring
  • Anomaly Detection with alerts

Integrations

  • Webhook integration with custom output
  • JSON export API

Dedicated Support

  • 12 feature requests per year
  • Unlimited 24/7/365 email priority support
Zero Storage — data deleted after detection
Bring your own model — 80% off (paid plans)
No hidden fees. No minimum commitment.

Compare plans in detail

Each feature below includes a short note on what it does.

Feature | Free | Pay as You Go | MAX-T
Usage & Limits
Monthly tokens

Total tokens (input + output + system prompt + RAG context) processed per month.

10M | Unlimited | Unlimited
Rate limit

How fast you can send trace data to our API. Higher tiers get priority throughput.

Standard | Enhanced | Highest priority
Server uplink speed

Dedicated server connection speed for your account — faster uplink means lower latency for trace ingestion.

1 Gbps | 2 Gbps | 5 Gbps
Max tokens per trace

Maximum token size for a single trace detection — includes input, output, system prompt, and RAG context combined. * 2M context models are subject to higher per-token pricing.

128K | Up to 2M * | Up to 2M *
Projects

Separate environments for different apps or services, each with its own API key.

Unlimited | Unlimited | 10
Team members

Number of users who can access the dashboard and manage projects.

1 | 1 | Unlimited
Data retention

How long we store your trace data and evaluation results before auto-deletion.

7 days | 90 days | 365 days
Core Features
LLM call tracing

Capture inputs, outputs, system prompts, and model info for every LLM call automatically.

Hallucination detection

LLM-as-judge evaluates each response for alignment with your system prompt. Score 0–100.

Full analytics & charts

Interactive dashboards with score trends, distributions, model comparisons, and session breakdowns.

End-to-end encryption

Your LLM API keys and data are encrypted in transit. We never store keys in plain text.

Alerts & Integrations
Email alerts

Get notified by email when hallucination scores exceed your configured threshold.

SMS alerts

Receive SMS notifications for critical hallucination events in real time.

Webhook integration with custom output

Send custom-formatted data to your webhook — integrate with Slack, PagerDuty, Jira, or any automation workflow.
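A consumer for those webhook alerts might look like this minimal sketch. The payload fields (`session_id`, `score`, `detection`) are assumptions based on the alert features above, not a documented schema — with custom output you define the shape yourself.

```python
# Toy webhook alert router (assumed payload fields).
import json

def handle_alert(raw_body: str):
    """Return a routing decision for an incoming alert, or None to ignore it."""
    event = json.loads(raw_body)
    if event.get("score", 0) >= 70:
        return f"page-oncall:{event['session_id']}"      # critical score
    if event.get("detection") == "pii_leak":
        return f"notify-security:{event['session_id']}"  # sensitive category
    return None  # below threshold -- no action
```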

JSON export API

Programmatically export all your trace data and evaluation results as JSON for external analysis.

Data & Storage
CSV data tables

Upload CSV files with your LLM data and run hallucination checks directly — no SDK needed.

Zero Storage option

Your trace data is deleted immediately after hallucination detection runs. We store nothing.

Advanced & Customization
Custom OpenAI SDK /v1 endpoint

Use your own LLM model via an OpenAI-compatible /v1 endpoint for hallucination evaluation.

80% token discount (BYOM)

Bring your own model and we only charge for traffic & bandwidth — 80% off standard token pricing.

Go Beyond Hallucination

Custom eval prompts — detect toxicity, bias, off-topic responses, PII leaks, or anything custom. You define the eval criteria and output format.

Topic Detection

Automatically classify conversation topics — politics, healthcare, finance, tech, and more. See what your users are asking about in real time.

Custom Detection Breakdown

Visual breakdown of all detection types — hallucination, toxicity, bias, PII leaks, and off-topic responses in one interactive chart.

Model Comparison analytics

Compare hallucination rates, toxicity, and bias across different LLM models side by side.

RAG Faithfulness scoring

Measure how faithfully your LLM responses match the provided RAG context — context match, source citation, and grounding scores.

Sentiment & Tone analysis

Analyze the sentiment and tone of LLM responses — professional, friendly, neutral, formal, or negative.

Response Cost Analytics

Track token costs per model, per request, and per project — see exactly where your LLM budget goes.

Prompt Compliance scoring

Measure how well LLM outputs follow your system prompt instructions — format, tone, boundaries, and instruction adherence.

Anomaly Detection

Automatically detect unusual spikes in hallucination scores and trigger alerts when anomalies are found.

Custom Eval LLM Endpoint

Create a custom eval endpoint. We POST your trace data to your own model for evaluation — full control over the eval pipeline.
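A custom eval endpoint reduces to: receive a POSTed trace, reply with a score. This toy sketch shows that shape; beyond the documented 0–100 scale, the request and response field names are assumptions.

```python
# Toy request handler for a custom eval endpoint (assumed field names).
import json

def score_trace(raw_body: str) -> str:
    """Score a POSTed trace and return a JSON response body."""
    trace = json.loads(raw_body)
    output = trace.get("output", "")
    # Replace this trivial heuristic with a call to your own model;
    # constants keep the sketch self-contained. Empty output = worst score.
    score = 5 if output else 100
    return json.dumps({"score": score})
```

Wrap `score_trace` in whatever HTTP framework you already run; full control over the eval pipeline means the logic inside is entirely yours.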

Volume discounts (15% off)

Get 15% off total token pricing at scale — the more you use, the less you pay.

Support
Community support

Access to documentation, guides, and community resources.

12 feature requests per year

Submit up to 12 custom feature or integration requests per year. Send us a ticket with your idea and we'll build it for you.

Unlimited 24/7/365 email priority support

Unlimited email priority support available around the clock for your critical needs. Separate from the 12 feature requests.

* 2M context model pricing: $0.28/1M input, $0.68/1M output. 128K model: $0.08/1M input, $0.18/1M output. MAX-T gets 15% off all transactions.
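As a quick sanity check of those numbers for the 128K model, a month of 50M input and 20M output tokens works out to 50 × $0.08 + 20 × $0.18 = $4.00 + $3.60 = $7.60 on Pay as You Go:

```python
# Worked example of 128K-model token pricing (rates from the footnote above).
RATE_IN, RATE_OUT = 0.08, 0.18  # dollars per 1M tokens

def monthly_cost(input_tokens: int, output_tokens: int, maxt: bool = False) -> float:
    cost = input_tokens / 1e6 * RATE_IN + output_tokens / 1e6 * RATE_OUT
    if maxt:
        cost *= 0.85  # MAX-T's 15% discount on all transactions
    return round(cost, 2)

# monthly_cost(50_000_000, 20_000_000) -> 7.6; with maxt=True -> 6.46
```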

Stop hallucinations before your users notice

Join teams using HalluTrace AI to monitor, evaluate, and improve their LLM outputs. Start free with 10M tokens every month.