CAS Framework is an evaluation platform for enterprise agentic AI. Score your agents across three dimensions — Compliance, Policy Adherence, and Agentic Patterns — and visualise the execution DAG with per-node CAS scores in real time.
Every agent in your fleet is evaluated across three independent dimensions. Some signatures are defaults — applied to every agent, regardless of type. Others are assigned per agent class to match that class's specific responsibilities.
Ensures every agent output adheres to data governance, privacy regulation, and organisational policy boundaries. Catches PII leakage, data residency violations, and sensitive context egress before it leaves the execution boundary.
Verifies agents follow your organisation's defined behavioural rules — which MCP tools are permitted, what state mutations are allowed, which output topics are restricted. Scored against CISO-defined DSPy guardrails.
Evaluates whether the agent followed its declared execution pattern — supervisor → specialist routing, loop bounds, tool call ordering, delegation protocols. Different agent types have different expected patterns.
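As a hedged illustration of the agentic-pattern dimension (all names and the pattern schema below are hypothetical, not CAS Framework's API), a pattern check can be sketched as comparing the observed node sequence against a declared supervisor → specialist pattern with a loop bound:

```python
# Hypothetical sketch: validate an observed execution trace against a
# declared agentic pattern (supervisor -> specialist routing, loop bound).
DECLARED_PATTERN = {
    "entry": "supervisor",
    "allowed_transitions": {
        "supervisor": {"search_specialist", "billing_specialist"},
        "search_specialist": {"supervisor"},
        "billing_specialist": {"supervisor"},
    },
    "max_loop_iterations": 3,
}

def check_pattern(trace, pattern):
    """Return a list of violations for a sequence of node names."""
    violations = []
    if not trace or trace[0] != pattern["entry"]:
        violations.append("run did not start at the declared entry node")
    for src, dst in zip(trace, trace[1:]):
        if dst not in pattern["allowed_transitions"].get(src, set()):
            violations.append(f"illegal transition {src} -> {dst}")
    # Loop bound: count returns to the entry (supervisor) node.
    loops = trace.count(pattern["entry"]) - 1
    if loops > pattern["max_loop_iterations"]:
        violations.append(f"loop bound exceeded ({loops} iterations)")
    return violations

trace = ["supervisor", "search_specialist", "supervisor", "billing_specialist"]
print(check_pattern(trace, DECLARED_PATTERN))  # []
```

An empty list means the run conformed to its declared pattern; each violation string would feed into the pattern-dimension score.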
Your OTel trace spans already contain the full execution graph. CAS Framework reconstructs that as a visual DAG — and overlays per-node CAS scores, signature verdicts, and violation details on every agent and tool in the run.
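Reconstructing the DAG from trace spans amounts to linking each span to its parent. A minimal sketch — the field names mirror OTel's span/parent-span IDs, but this is not CAS Framework's actual schema:

```python
from collections import defaultdict

# Minimal sketch: rebuild a parent -> children adjacency map from
# OTel-style spans via span_id / parent_span_id.
spans = [
    {"span_id": "a1", "parent_span_id": None, "name": "supervisor"},
    {"span_id": "b2", "parent_span_id": "a1", "name": "search_specialist"},
    {"span_id": "c3", "parent_span_id": "a1", "name": "tool:web_search"},
]

def build_dag(spans):
    children = defaultdict(list)
    roots = []
    for span in spans:
        parent = span["parent_span_id"]
        if parent is None:
            roots.append(span["span_id"])
        else:
            children[parent].append(span["span_id"])
    return roots, dict(children)

roots, edges = build_dag(spans)
print(roots)  # ['a1']
print(edges)  # {'a1': ['b2', 'c3']}
```

Per-node CAS scores and verdicts would then be attached to each node ID as overlay metadata.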
`requests` library detected · not in `allowed_libraries` · network egress attempt flagged (`user_permissions`) · policy `TOOL_SAFETY_004`
Each "Zero" eliminates a category of blockers — from security reviews to developer overhead. Together they mean your platform, security, and AI teams can all say yes at the same time.
Raw agent conversations, prompts, and PII evaluated inside your VPC. Only CAS scores cross the network boundary.
Platform Engineering owns the deployment. AI Engineers add one label. That's the entire integration surface.
`helm install` deploys the Mutating Webhook, Eval Engine, and Control Plane to your cluster. Adding `cas-framework.ai/inject: "enabled"` is the entire integration commitment from an AI engineering team. When your agent fleet adopts a new tool dynamically, evaluation coverage follows in milliseconds — not sprints.
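As an illustration, the label key above sits on the workload's pod template; the surrounding Deployment manifest here is hypothetical:

```yaml
# Hypothetical Deployment fragment — the only line the AI team adds
# is the cas-framework.ai/inject label.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: data-analysis-agent
spec:
  template:
    metadata:
      labels:
        cas-framework.ai/inject: "enabled"  # mutating webhook injects the eval sidecar
```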
Every default signature ships as a `dspy.Signature` class, ready for evaluation.

OpenTelemetry is the universal observability standard. CAS Framework speaks it natively — so any OTel-emitting agent is automatically supported.
`import openlit; openlit.init(otlp_endpoint="http://cas-sidecar:4318")` — two lines cover all three evaluation dimensions on any Python agent.

Every compliance cycle traditionally requires engineering tickets, DSL rewrites, reviewed PRs, staging tests, and coordinated production windows. By the time a new threat is addressed, your fleet has already been running unprotected for two weeks.
CAS Framework's Dynamic Signature Sync Engine closes that window to seconds. A policy author writes in plain language, an internal LLM compiles it to a validated dspy.Signature, and a gRPC/SSE persistent connection propagates it globally — without restarting a single Kubernetes pod.
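A hedged sketch of that compile step — every name here is hypothetical, and the plain dataclass below only mimics the shape of the validated `dspy.Signature` the real engine emits:

```python
from dataclasses import dataclass

# Hypothetical stand-in for the compiled artifact. In CAS Framework an
# internal LLM performs the compilation; here one example is hardcoded
# to show the shape of the output.
@dataclass(frozen=True)
class CompiledSignature:
    name: str
    inputs: tuple
    outputs: tuple
    instructions: str

def compile_policy(plain_language: str) -> CompiledSignature:
    return CompiledSignature(
        name="NoExternalEgress",
        inputs=("agent_output", "tool_calls"),
        outputs=("verdict", "rationale"),
        instructions=plain_language.strip(),
    )

sig = compile_policy("Agents must never send data to endpoints outside the VPC.")
print(sig.name)  # NoExternalEgress
```

The validated artifact is what the gRPC/SSE channel would push to every running evaluator, which is why no pod restart is needed.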
Architecture Decision Records were always manual — engineers documenting why a system behaved the way it did. For an agent fleet making thousands of routing decisions per second, that's impossible without automation.
CAS Framework generates structured ADRs from every evaluation. When an agent chooses one MCP over another, routes to a specialist, or gets blocked — the reasoning, score, applied policy, and recommendation are captured as an immutable record.
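The generated record can be pictured as a frozen structure — a sketch whose field names are assumptions, not the framework's schema:

```python
import json
from dataclasses import dataclass, asdict

# Illustrative, immutable ADR record — frozen=True makes instances
# read-only after creation, mirroring the "immutable record" property.
@dataclass(frozen=True)
class AgentDecisionRecord:
    agent: str
    decision: str          # e.g. which MCP/specialist was chosen, or "blocked"
    cas_score: float
    applied_policy: str
    rationale: str
    recommendation: str

adr = AgentDecisionRecord(
    agent="BillingAgent",
    decision="delegated to payments_specialist",
    cas_score=0.77,
    applied_policy="A2A_DELEGATION_002",
    rationale="Delegation target not in allowed specialist list",
    recommendation="Update allowed specialist targets",
)
print(json.dumps(asdict(adr), indent=2))
```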
Example: an agent calls `requests` for an external network call during code sandbox execution. `requests` is in `forbidden_libraries` for this agent's policy (`allow_network_egress: false`).

Aggregate CAS scores, per-agent drift detection, 90-day risk forecasts, and cost shield metrics — from real-time operational data to strategic leadership dashboards.
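Per-agent drift detection of the kind listed here can be approximated by comparing a recent window of CAS scores against a historical baseline; the window size and tolerance below are invented for illustration:

```python
from statistics import mean

# Illustrative drift check: flag an agent when its recent mean CAS score
# falls more than `tolerance` below its historical baseline.
def detect_drift(scores, window=5, tolerance=0.10):
    if len(scores) < 2 * window:
        return False  # not enough history to compare
    baseline = mean(scores[:-window])
    recent = mean(scores[-window:])
    return (baseline - recent) > tolerance

history = [0.82, 0.84, 0.81, 0.83, 0.80, 0.79, 0.62, 0.60, 0.58, 0.61, 0.59]
print(detect_drift(history))  # True
```

A flagged agent would surface in the risk table below with an elevated risk level and a recommended action.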
| Agent | CAS | Risk | Top Violation | Recommended Action |
|---|---|---|---|---|
| DataAnalysisAgent | 0.74 | MEDIUM | Code Sandbox: forbidden library | Review agent prompt + sandbox policy |
| Slack_Notification | 0.51 | HIGH | Policy: state mutation violation | Immediate review. Enable shadow mode. |
| BillingAgent | 0.77 | MEDIUM | A2A delegation to unknown target | Update allowed specialist targets |
Operational (live) · Tactical (weekly trends) · Strategic (90-day AI risk posture projections)
OpenTelemetry is the universal language. Any agent that speaks it gets full evaluation coverage — compliance, policy adherence, and agentic pattern scoring — automatically.
| Capability | CAS Framework | Traditional Observability / APM |
|---|---|---|
| Evaluation Model | 3-dimension CAS: Compliance, Policy, Agentic Patterns | Metrics and traces only — no semantic evaluation |
| Agent DAG Visualisation | OTel trace → DAG reconstruction with per-node scores | Flat trace waterfalls, no agent topology |
| Data Privacy | Zero-Egress — local eval, offline NLP redaction | Raw prompts and responses egressed to vendor cloud |
| Default Signatures | Pre-built DSPy signatures for ADK patterns, OOTB | No semantic evaluation capability |
| Policy Updates | Dynamic gRPC/SSE push — <5ms, zero pod restarts | Full CI/CD redeploy, 2–4 week cycle |
| Evaluation Cost | LLM Cascade — 90% on fast/cheap local models | Frontier API per evaluation, unbounded cost |
| Compliance ADRs | Auto-generated per violation — structured, immutable | Manual documentation if it exists at all |
| Agent Framework Support | OTel-native — ADK, LangGraph, CrewAI, any agent | Framework-specific, proprietary SDK per vendor |
| Developer Overhead | One K8s label — no SDK, no PR, no release | SDK install and maintenance across every repo |
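The LLM Cascade row refers to routing most evaluations through a fast, cheap local model and escalating only low-confidence cases to a frontier model. A sketch with stub judges — every function and threshold here is hypothetical:

```python
# Illustrative cascade: try a fast local judge first, escalate to a
# frontier model only when the local confidence is low. Both "models"
# are stubs standing in for real inference calls.
def local_judge(sample):
    # Pretend the small model is confident on short, clear-cut outputs.
    confident = len(sample) < 200
    return {"verdict": "pass", "confidence": 0.95 if confident else 0.40}

def frontier_judge(sample):
    return {"verdict": "pass", "confidence": 0.99}

def cascade_eval(sample, threshold=0.80):
    result = local_judge(sample)
    if result["confidence"] >= threshold:
        return result["verdict"], "local"
    return frontier_judge(sample)["verdict"], "frontier"

print(cascade_eval("short agent output"))  # ('pass', 'local')
print(cascade_eval("x" * 500))             # ('pass', 'frontier')
```

With a calibrated threshold, the expensive path is reserved for the minority of ambiguous cases, which is what bounds evaluation cost.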
CAS Framework deploys in 5 minutes. Your first DAG with per-node CAS scores appears immediately afterwards.