Posture & Compliance

Designed for networks where the cloud is not an option.

Cascadia is an on-prem runtime. Prompt data, model weights, intermediate activations, and generated tokens never leave your LAN. Integrations are minimal by design — your SIEM, your identity provider, your audit trail.

HIPAA-ready
GDPR by design
SOC 2 alignment
ITAR-compatible

What crosses your network boundary? Nothing.

Cascadia opens no outbound connections at runtime. The full inference loop — prompt, context, model weights, intermediate hidden states, generated tokens — lives on endpoints you own and operate.

Stays on your network
◆  Prompts & completions: end-user input
◆  Context windows: chat / RAG
◆  KV caches: stateful per-request
◆  Model weights: INT4 shards
◆  Intermediate activations: hidden states over TCP
◆  Generated tokens: stream to end user
◆  Audit logs: OTLP · your SIEM
◆  Session state: coordinator memory
Crosses your network boundary
Nothing.
Zero outbound connections at runtime.
No telemetry · No phone-home · No model-provider calls · No update checks · No crash reports
Verify it yourself. Inspect network traffic on any worker during inference, or block all outbound connections at your firewall. The capture shows no Cascadia-related traffic leaving the LAN, and the firewall rule has nothing to block.
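
One way to run that check, sketched under assumptions: export the destination addresses from a packet capture taken on a worker, then flag any that fall outside LAN address space. The function names below are illustrative and not part of Cascadia. The sketch treats private, loopback, and link-local ranges as local, and `lan_networks` is a hypothetical parameter for sites whose LAN uses other ranges.

```python
import ipaddress


def is_lan_local(addr, lan_networks=()):
    """Return True if addr looks LAN-local (assumption: private,
    loopback, and link-local ranges count as local)."""
    ip = ipaddress.ip_address(addr)
    # Site-specific ranges, e.g. a LAN that uses routable address space.
    if any(ip in ipaddress.ip_network(net) for net in lan_networks):
        return True
    return ip.is_private or ip.is_loopback or ip.is_link_local


def outbound_destinations(dest_addrs, lan_networks=()):
    """Return the sorted, de-duplicated destinations that are not LAN-local."""
    return sorted({a for a in dest_addrs if not is_lan_local(a, lan_networks)})


if __name__ == "__main__":
    # Hypothetical destination addresses pulled from a worker's capture.
    captured = ["10.0.0.5", "192.168.1.20", "127.0.0.1"]
    print(outbound_destinations(captured))
```

An empty result is what the claim above predicts. Multicast and broadcast destinations are not treated as local here, so a noisy LAN may need extra entries in `lan_networks` or a pre-filter on the capture.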

Four properties, non-negotiable.

These aren't configuration toggles. They're the shape of the system itself — the product of choosing a distributed on-prem runtime over a wrapped cloud API.

  1. Prompts and completions are processed on endpoints you own and operate. Cascadia opens no outbound connections at runtime. Full data locality as a property of the architecture, not a configuration.

  2. Deployable in offline environments. Shards, coordinator, and workers ship as a single artifact with no external dependencies at runtime. No outbound DNS, no update servers, no telemetry endpoints.

  3. Weights stay on your fleet. The export pipeline runs once on your build machine before a shard ever touches a worker. No third-party API, no hosted model provider, no licensed inference endpoint between you and the output.

  4. Structured logs for every request, every pipeline stage, every token generated. OTLP-compatible spans pipe directly into your existing observability stack. Replay, trace, and account for every inference call.
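
As a rough illustration of a metadata-only audit record, the sketch below models one request's span using the fields this page lists for audit logs: request ID, prompt token count, model version, shard hash, worker IDs traversed, per-stage latency, and total generation time. The class and function names are hypothetical, and plain JSON stands in for a real OTLP exporter. This is not Cascadia's actual log schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RequestSpan:
    """Hypothetical per-request record: metadata only, no prompt or completion text."""
    request_id: str
    prompt_token_count: int
    model_version: str
    shard_hash: str
    worker_ids: tuple          # workers traversed, in pipeline order
    stage_latency_ms: dict     # stage or worker name -> latency in milliseconds
    total_generation_ms: float


def to_log_line(span):
    """Serialize the span as one JSON line for a log shipper to pick up."""
    return json.dumps(asdict(span), sort_keys=True)


if __name__ == "__main__":
    span = RequestSpan(
        request_id="req-0001",
        prompt_token_count=128,
        model_version="example-model",
        shard_hash="0" * 64,
        worker_ids=("worker-a", "worker-b"),
        stage_latency_ms={"worker-a": 3.5, "worker-b": 4.0},
        total_generation_ms=250.0,
    )
    print(to_log_line(span))
```

Keeping content fields out of the type itself, instead of filtering them at export time, means a record cannot carry prompt text by accident.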

Four industries where cloud inference is a liability.

Each case has the same driver: the work requires AI, the data can't leave the network, and third-party inference creates more risk than it removes. Cascadia's security posture isn't a feature — it's the use case.

Financial services

Draft deal docs without triggering a data egress event.

Generation runs on-LAN. Prompts never cross the firm's perimeter, so no egress event is recorded. Audit logs stay in your SIEM under the same retention policy as email.

Legal

Run contract review without risking privilege.

Nothing leaves the firm's network. Privileged documents stay under the same access and retention controls as the DMS. No third-party logging, no cross-client exposure.

Healthcare

Summarize clinical notes without a new BAA.

No third party, no BAA. PHI never leaves the hospital network. Existing HIPAA audit trails extend to Cascadia's per-request logs without a vendor-review cycle.

Public sector

Operate where cloud inference isn't permitted at all.

Air-gap native. No update servers, no telemetry, no outbound DNS. Deploys as a single artifact with nothing to disable and no external dependency to contain.

What your infrastructure team needs to know.

Cascadia runs on what you already have. Stock drivers, open protocols, no kernel work. Bring your own identity and observability — we don't replace them, we integrate.

OS: Windows 11 (23H2 or later)
Runtime: OpenVINO 2026.1 (published bits)
Toolchain: Rust 1.95+ stable (MSRV 1.85 · edition 2024)
Silicon: Intel Core Ultra (Lunar Lake · Panther Lake)
Drivers: Stock Intel (no kernel modules)
Network: TCP/IP · LAN (no specialized fabric)
Outbound: None required (air-gap compatible)
Integrations: Your IdP + SIEM (OIDC · OTLP)

Frequently asked questions

Inspect network traffic on any worker node during inference. Block all outbound connections at your firewall. Neither should produce any Cascadia-related traffic — the runtime opens no outbound sockets beyond the coordinator's LAN-local TCP connections to peer workers. Our preprint includes packet captures from a production fleet; pilot environments replay the same validation on your network.
Structured spans per request: request ID, prompt token count, model version, shard hash, worker IDs traversed, per-stage latency, total generation time. No prompt contents, no completions, no PHI/PII — just the metadata your SIEM needs for retention and compliance reporting. OTLP-compatible; pipe it wherever you already aggregate logs.
Worker processes are isolated per node; a compromised worker affects only its own shard and its local state. The coordinator will isolate failed workers automatically once failover ships (tracked in the limits section above); today, an affected request aborts. Because Cascadia ships with no telemetry or call-home, incident scope stays local to your network.
Each compiled shard ships with a SHA-256 digest in its manifest. Workers verify the digest before loading; the coordinator verifies the digest before routing to a worker. Sigstore-compatible supply-chain signing (build attestation, transparency log) is on the roadmap for environments that need verifiable provenance beyond hash-match.
In the current preview, upgrades are all-or-nothing: stop the fleet, deploy new shards and runtime, restart. The coordinator doesn't negotiate mixed versions yet. Versioned coordinator negotiation for rolling upgrades is tracked in the limits section — until it ships, treat Cascadia upgrades the way you'd treat a database migration: staged, on off-hours, with a rollback artifact prepared.
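
The shard digest check described above (a SHA-256 digest in the manifest, verified before a shard is loaded) can be sketched as follows. The function names and the idea of passing the expected digest as a hex string are assumptions for illustration; the manifest format is not specified on this page, so parsing it is left out.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large shards are never read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_shard(path, expected_hex):
    """Return True only if the file's SHA-256 matches the expected hex digest."""
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

A loader built this way would refuse any shard for which `verify_shard` returns False. A hash match shows that the bytes are the ones the manifest describes, not who produced them, which is the gap the roadmap's signing work is meant to close.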