A complete technical reference for the CAS Framework system — from OTel span ingestion and DAG reconstruction through DSPy evaluation, Zero-Egress data flow, and the Dynamic Signature Sync Engine.
The complete system: the Customer VPC boundary, every component running inside it, what crosses the boundary, and the SaaS Control Plane. The critical invariant is that no raw data leaves the VPC; only CAS scores, violation flags, and trace topology cross the boundary.
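One way to enforce the zero-egress invariant is an explicit allowlist applied just before a payload leaves the VPC. The sketch below is illustrative only: the field names (`cas_score`, `violation_flags`, and so on) and the `to_egress_payload` helper are assumptions, not the actual CAS Framework schema.

```python
# Hypothetical allowlist of egress-safe fields: scores, flags, and the
# IDs needed to rebuild trace topology. Anything else stays in the VPC.
EGRESS_ALLOWED_FIELDS = frozenset({
    "trace_id",
    "span_id",
    "parent_span_id",
    "node_type",
    "cas_score",
    "violation_flags",
})


def to_egress_payload(evaluated_span: dict) -> dict:
    """Return a copy of the span containing only allowlisted fields."""
    return {
        key: value
        for key, value in evaluated_span.items()
        if key in EGRESS_ALLOWED_FIELDS
    }


example_span = {
    "trace_id": "t-1",
    "span_id": "s-2",
    "parent_span_id": "s-1",
    "node_type": "llm_call",
    "cas_score": 0.91,
    "violation_flags": [],
    "prompt": "raw customer text that must never leave the VPC",
}
egress_payload = to_egress_payload(example_span)
```

An allowlist fails closed: a new raw-data field added upstream is dropped by default rather than leaked, which a denylist would not guarantee.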
The complete sequence from your agent emitting an OTel span through PII stripping, DAG reconstruction, LLM Cascade routing, Presidio offline redaction, and final CAS score computation — showing every component handoff and what data travels between them.
How CAS Framework converts a flat list of OTel spans into a visualised execution graph with CAS scores on every node. The parent-child span relationships already encode the full execution topology — we reconstruct it, assign evaluation dimensions per node, select the right DSPy signature, and score each node independently.
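The reconstruction step can be sketched in a few lines, since each OTel span carries its own ID and its parent's ID. The `Span` and `Node` classes and the `build_dag` and `walk` helpers below are hypothetical names for illustration; the real engine also assigns evaluation dimensions and scores per node, which this sketch omits.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_span_id: Optional[str]  # None for a root span
    name: str


@dataclass
class Node:
    span: Span
    children: list["Node"] = field(default_factory=list)


def build_dag(spans: list[Span]) -> list[Node]:
    """Link a flat span list into trees using parent/child span IDs.

    Spans whose parent is missing from the batch are treated as roots
    (an assumption made here so that no span is silently dropped).
    """
    nodes = {span.span_id: Node(span) for span in spans}
    roots = []
    for span in spans:
        parent = nodes.get(span.parent_span_id) if span.parent_span_id else None
        if parent is None:
            roots.append(nodes[span.span_id])
        else:
            parent.children.append(nodes[span.span_id])
    return roots


def walk(node: Node, depth: int = 0) -> list[tuple[int, str]]:
    """Depth-first traversal returning (depth, span name) pairs."""
    visited = [(depth, node.span.name)]
    for child in node.children:
        visited.extend(walk(child, depth + 1))
    return visited


flat_spans = [
    Span("a", None, "agent.run"),
    Span("b", "a", "llm.call"),
    Span("c", "a", "tool.call"),
]
roots = build_dag(flat_spans)
```

Because the node index is built before any linking happens, the result does not depend on the order in which spans arrive.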
How a natural language policy rule written in the SaaS console gets compiled into a validated DSPy Signature, backtested in shadow mode against 30 days of history, and propagated globally to every running Sidecar in under 5 milliseconds — without restarting a single Kubernetes pod.
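On the Sidecar side, applying a pushed signature without a restart amounts to validating it and then swapping it into an in-memory registry. The sketch below assumes a plain dictionary as a stand-in for the compiled signature DSL. `SignatureRegistry`, `KNOWN_SPAN_ATTRIBUTES`, and the `inputs`/`outputs` keys are all hypothetical, and nothing here uses the DSPy API itself.

```python
import threading
from typing import Optional

# Hypothetical set of span attributes a signature may reference.
KNOWN_SPAN_ATTRIBUTES = {"name", "kind", "status", "attributes", "duration_ms"}


class SignatureRegistry:
    """In-memory, thread-safe store of the active signature per policy."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active: dict[str, tuple[int, dict]] = {}

    def apply(self, policy_id: str, version: int, signature: dict) -> bool:
        """Validate, then swap in the signature if it is newer.

        Returns True when the swap happened and False when the pushed
        version was stale or a duplicate.
        """
        unknown = set(signature.get("inputs", [])) - KNOWN_SPAN_ATTRIBUTES
        if unknown:
            raise ValueError(f"unknown span attributes: {sorted(unknown)}")
        with self._lock:
            current = self._active.get(policy_id)
            if current is not None and current[0] >= version:
                return False
            self._active[policy_id] = (version, signature)
            return True

    def get(self, policy_id: str) -> Optional[dict]:
        """Return the active signature for a policy, or None."""
        with self._lock:
            entry = self._active.get(policy_id)
        return entry[1] if entry else None


registry = SignatureRegistry()
registry.apply("no-pii", 1, {"inputs": ["attributes"], "outputs": ["violation"]})
```

Rejecting stale versions keeps replays and out-of-order deliveries on a reconnecting stream from rolling a Sidecar back to an older policy.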
Every technology in the CAS Framework stack, why it was chosen, and what it owns in the system.
| Component | Technology | Owns | Why This Choice |
|---|---|---|---|
| Sidecar Injection | Kubernetes Mutating Webhook | Auto-injects OTel Collector + Eval sidecar into labelled Pods. Zero developer action beyond one label. | The only K8s-native way to achieve zero-code-change instrumentation across an entire cluster without SDK proliferation. |
| Span Collection | OTel Collector (CNCF) | Receives spans from the agent process via gRPC, strips PII inline using transform processors, then buffers and batches spans before they reach the evaluation engine. | CNCF standard: works with every major agent framework natively. No vendor lock-in. Transform processors enable inline PII removal before data moves. |
| Evaluation Engine | FastAPI + DSPy | Reconstructs DAG from span parent/child IDs. Routes spans to the correct DSPy signature per node type. Computes per-node and global CAS scores. | FastAPI for high-throughput async span processing. DSPy for structured LLM evaluation — signatures are strongly typed, reproducible, and version-controlled. |
| LLM Cascade Router | Custom Python + DSPy | Routes each evaluation to local (Llama3-8B, 90%) or frontier (Gemini Pro, 10%) model based on complexity scoring. Achieves 90% cost reduction. | Frontier models are expensive at agent fleet scale. Most evaluations are structurally straightforward — local 8B models handle them accurately at a fraction of the cost. |
| NLP Redaction | Microsoft Presidio | Offline async NLP pass replacing entities (names, SSNs, card numbers) with typed placeholders. Runs as Hatchet background worker. | Presidio is purpose-built for PII recognition and de-identification. Running offline (Hatchet worker) means it never blocks the evaluation path — latency stays low. |
| Job Queue | Hatchet | Manages async Presidio redaction jobs, retry logic, and priority queuing for heavy NLP workloads. | Hatchet provides durable, observable background job execution with built-in retry and dead-letter queues — essential for heavy NLP workloads at scale. |
| Analytics Storage | ClickHouse | Stores egress-safe payloads (scores, flags, topology). Serves DAG queries, per-agent CAS aggregations, 30-day backtesting, and 90-day projections. | ClickHouse achieves sub-millisecond aggregation on time-series data at hundreds of millions of spans/day. PostgreSQL cannot serve DAG queries at agent fleet scale without degrading. |
| Policy Sync | gRPC / SSE persistent connections | Maintains long-lived connections from SaaS to all Sidecars. Pushes compiled DSPy signatures globally in under 5ms without pod restarts. | Long-lived gRPC streams eliminate polling overhead and achieve near-instant propagation. SSE fallback for environments where gRPC is restricted. |
| Meta-Compiler | Internal LLM (hosted in SaaS) | Translates natural language policy rules into validated DSPy Signature JSON DSL. Validates output against OTel span schema before propagation. | Natural language input dramatically reduces the policy authoring barrier — compliance teams can write rules without DSPy knowledge. Schema validation ensures no malformed signatures reach Sidecars. |
| Dashboard | Next.js | DAG visualiser with per-node CAS scores, fleet health KPIs, agent risk posture, 30-day trend charts, leadership projections, ADR archive. | Server-rendered for fast initial load. React component model enables the interactive DAG canvas and real-time score updates via WebSocket. |
| Ingest API | Go / Rust | High-throughput ingestion of zero-egress payloads from Sidecars. Handles burst traffic from large agent fleets without dropping spans. | Go/Rust for the ingestion hot path because Python cannot sustain the throughput required for large enterprise agent fleets at sub-millisecond latency targets. |
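
The LLM Cascade Router row in the table above describes routing by complexity score. A minimal sketch of that idea follows, assuming a simple threshold rule. The `EvalRequest` fields, the weights inside `complexity_score`, and the 0.7 threshold are invented for illustration and are not the framework's actual scoring function.

```python
from dataclasses import dataclass

LOCAL_MODEL = "llama3-8b"      # handles the bulk of evaluations
FRONTIER_MODEL = "gemini-pro"  # reserved for complex evaluations


@dataclass
class EvalRequest:
    node_type: str
    token_count: int
    tool_calls: int


def complexity_score(request: EvalRequest) -> float:
    """Hypothetical heuristic in [0, 1]: longer, tool-heavy spans score higher."""
    length_term = min(request.token_count / 4000, 1.0)
    tool_term = min(request.tool_calls / 5, 1.0)
    return 0.6 * length_term + 0.4 * tool_term


def route(request: EvalRequest, threshold: float = 0.7) -> str:
    """Send the request to the frontier model only when it is complex enough."""
    if complexity_score(request) >= threshold:
        return FRONTIER_MODEL
    return LOCAL_MODEL


simple_choice = route(EvalRequest("llm_call", token_count=200, tool_calls=0))
complex_choice = route(EvalRequest("planner", token_count=8000, tool_calls=6))
```

In practice the threshold would be tuned against observed traffic so that the local/frontier split lands near the intended ratio.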
The key trade-offs made during design — what alternatives were considered and why we chose what we did. These are the decisions that define the system's character.
One Helm chart. One Kubernetes label. Your agent fleet DAG with per-node CAS scores appears automatically.