[2026.2.3-beta]
🎉 Third Public Beta Release
The AI Security Gateway is a unified security platform providing real-time monitoring, policy enforcement, and threat detection for Large Language Model (LLM) APIs and Model Context Protocol (MCP) servers. This beta release represents a comprehensive security proxy and monitoring platform for AI infrastructure.
This release introduces our Guardrails Evaluation scanning tool, improved relationship visualisation graphs on the dashboard, and enhancements to the Canary Token detection feature.
🕵️ Canary Token Detection
Canary Token Injection is a security feature that helps detect when data from one user or session is accidentally exposed to another. Think of it as a tripwire : an early warning system that alerts you to potential data leakage in your AI systems.
When proxying requests, the gateway silently injects unique, invisible tokens into each user's conversation. If a token surfaces where it shouldn't, you'll know immediately.
Detection types:
- Cross-User Leakage: A canary from User A appeared in a response to User B, indicating data bleed between users
- Cross-Session Leakage: A canary from Session A appeared in the same user's Session B, indicating session isolation failure
- Provider Memorisation: A canary resurfaced without being present in the current context, suggesting the LLM provider has memorised prior conversation data
- Stale Canary: A canary older than 7 days reappeared, a strong indicator of long-term memorisation by the model provider
🛡️ Guardrails Evaluation
Having guardrails is one thing. Knowing they actually work is another.
Guardrails Evaluation is automated penetration testing for your AI safety controls. It runs a comprehensive suite of security test cases against your endpoints and scores the results against the OWASP LLM Top 10 and NIST AI Risk Management Framework.
What it tests:
- Prompt Injection: Direct injection, goal hijacking, and system prompt extraction
- MCP Tool Poisoning: Malicious tool descriptions, command injection, and protocol exploitation
- Bypass Techniques: Encoding obfuscation, flag manipulation, and filter evasion
- Semantic & Structural Evasion: Skeleton key, roleplay, payload splitting, homoglyph substitution, and multilingual evasion
- Data Exfiltration: Credential theft, PII extraction, and proprietary data leakage
- Multi-Turn Escalation: Crescendo attacks, echo chamber context poisoning, and many-shot overrides
- Harmful Content & Toxicity: Requests to generate violent, self-harm, or weapons content
- Misinformation: Fake news generation and disinformation campaigns
- PII Extraction: Attempts to extract personal data about real individuals
- Resource Exhaustion: Recursive loops, context window flooding, and token bombing
- Benign Controls: Legitimate requests that should pass through (false positive validation)
Key features:
- 80+ built-in test cases across 12 categories, with the ability to add your own custom tests
- Compliance scoring mapped to OWASP LLM Top 10 and NIST AI RMF
- Test any endpoint: works with any API that wraps an LLM, not just direct LLM providers. Import endpoints via curl command paste
- Multi-turn attack simulation: tests that span multiple conversation turns to detect escalation vulnerabilities
- Per-category risk breakdown with pass/fail rates and weighted risk scores
🔗 Relationship Visualisation
A new interactive graph on the dashboard that visualises relationships between proxies, policies, users, and data flows, making it easier to understand how your AI infrastructure is connected at a glance.