🩻 AC-2 · Agent Scanner
Real-time AI output scanning for manipulation, injection, hallucination, and data exfiltration
Overview

Agent Scanner acts as an antivirus for AI agent responses. It scans every output in real time, combining NLP classifiers, regex pattern matching, and behavioral analysis to detect six threat categories.

Detection Categories
Category | Description | Methods
Prompt Injection | Hidden instructions attempting to override system prompts | NLP + regex + embedding similarity
Hallucination | Fabricated facts, citations, or data | Fact-checking + knowledge graph
Data Exfiltration | Attempts to leak PII, secrets, or internal data | Pattern matching + entity detection
Jailbreak | Adversarial prompts bypassing safety guardrails | Classifier + known pattern DB
Toxic Content | Harmful, hateful, or inappropriate outputs | Content moderation classifier
PII Leakage | Unauthorized exposure of personal information | NER + regex (SSN, email, phone, etc.)
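The PII Leakage row names regex matching for SSNs, emails, and phone numbers. A minimal sketch of what such a regex layer might look like follows. The patterns, the `findPii` name, and the result shape are illustrative assumptions, not Agent Scanner's actual rules, and a real scanner would pair them with NER as the table notes.

```javascript
// Illustrative patterns only; these are assumptions, not Agent Scanner's rule set.
const PII_PATTERNS = [
  { type: 'ssn', regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  { type: 'email', regex: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g },
  { type: 'phone', regex: /\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b/g },
];

// Hypothetical helper: returns one finding per regex match in the text.
function findPii(text) {
  const findings = [];
  for (const { type, regex } of PII_PATTERNS) {
    // matchAll requires the global flag and does not mutate the shared regex.
    for (const match of text.matchAll(regex)) {
      findings.push({ type, value: match[0], index: match.index });
    }
  }
  return findings;
}

const sample = 'Sure! Your SSN is 123-45-6789. Reach me at jane@example.com.';
console.log(findPii(sample));
```

Regex alone tends to produce false positives (any nine digits in the SSN layout will match), which is presumably why the table lists it alongside NER rather than on its own.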
API Endpoints
Method | Endpoint | Description | Auth
POST | /v1/scan | Scan agent output for threats | 🔑
GET | /v1/scan/history | Get scan history | 🔑
GET | /v1/scan/stats | Get scanning statistics | 🔑
POST | /v1/scan/rules | Create a custom scan rule | 🔑
GET | /v1/scan/rules | List custom rules | 🔑
DELETE | /v1/scan/rules/:id | Delete a custom rule | 🔑
Quick Start
curl -X POST https://api.agent-chain.io/v1/scan \
  -H "Authorization: Bearer ac_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "output": "Sure! Your API key is sk-abc123 and your SSN is 123-45-6789.",
    "context": "customer-support-agent"
  }'
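For callers not using curl, the same request might be issued from Node.js 18+ with the built-in `fetch`. This is a sketch that mirrors the curl call above. `buildScanRequest` and `scanOutput` are hypothetical helper names, and error handling beyond the HTTP status check is left out.

```javascript
const SCAN_URL = 'https://api.agent-chain.io/v1/scan';

// Hypothetical helper: builds the fetch arguments equivalent to the curl call.
function buildScanRequest(apiKey, output, context) {
  return {
    url: SCAN_URL,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ output, context }),
    },
  };
}

// Hypothetical wrapper: sends the request and returns the parsed JSON body.
async function scanOutput(apiKey, output, context) {
  const { url, options } = buildScanRequest(apiKey, output, context);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Scan request failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without hitting the API.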
Response
{
  "safe": false,
  "threats": [
    {
      "type": "pii_leakage",
      "severity": "critical",
      "detail": "SSN pattern detected"
    },
    {
      "type": "data_exfiltration",
      "severity": "high",
      "detail": "API key exposed"
    }
  ],
  "score": 15,
  "latency_ms": 23
}
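One way to act on this response is to rank the reported threats by severity and choose a handling policy. The sketch below reads only fields shown in the sample (`safe`, `threats`, `severity`). The severity levels other than `critical` and `high`, the `decideAction` name, and the allow/sanitize/block policy are assumptions for illustration.

```javascript
// Assumed ordering; only "critical" and "high" appear in the sample response.
const SEVERITY_RANK = { low: 1, medium: 2, high: 3, critical: 4 };

// Hypothetical policy: block on critical threats, sanitize on any other unsafe result.
function decideAction(scanResult) {
  if (scanResult.safe) return 'allow';
  const threats = scanResult.threats ?? [];
  // Unknown severities rank as 0 so they never trigger a block on their own.
  const worst = Math.max(0, ...threats.map((t) => SEVERITY_RANK[t.severity] ?? 0));
  return worst >= SEVERITY_RANK.critical ? 'block' : 'sanitize';
}

const sampleResult = {
  safe: false,
  threats: [
    { type: 'pii_leakage', severity: 'critical', detail: 'SSN pattern detected' },
    { type: 'data_exfiltration', severity: 'high', detail: 'API key exposed' },
  ],
  score: 15,
  latency_ms: 23,
};

console.log(decideAction(sampleResult)); // "block"
```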
SDK Example
const result = await ac.scanner.scan({
  output: agentResponse,
  context: 'my-chatbot'
});

if (!result.safe) {
  console.log('Threats found:', result.threats);
  // Block or sanitize the response
}
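The "block or sanitize" step can be fleshed out as a small wrapper. In this sketch the scan call is passed in as a function, so the wrapper works with any client. `guardResponse`, the fallback message, and the fail-closed behavior when the scanner errors are assumptions, not documented SDK behavior.

```javascript
const FALLBACK_MESSAGE = 'Sorry, I cannot share that response.';

// Hypothetical wrapper: forwards the agent response only if the scan marks it safe.
// `scan` is any async function taking { output, context } and resolving to a
// result with `safe` and `threats`, as in the response shown earlier.
async function guardResponse(scan, agentResponse, context) {
  try {
    const result = await scan({ output: agentResponse, context });
    if (result.safe) {
      return { delivered: agentResponse, threats: [] };
    }
    return { delivered: FALLBACK_MESSAGE, threats: result.threats ?? [] };
  } catch (err) {
    // Fail closed: if the scan itself fails, do not forward unscanned output.
    return { delivered: FALLBACK_MESSAGE, threats: [], error: String(err) };
  }
}
```

With the SDK call from the example above, this could be invoked as `guardResponse((req) => ac.scanner.scan(req), agentResponse, 'my-chatbot')`. Failing closed trades availability for safety; a latency-sensitive chatbot might prefer to fail open and log the error instead.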
💡 Combine Agent Scanner with Agent Firewall (AC-8) for a complete defense-in-depth strategy. Scanner catches output threats; Firewall catches input threats.