Services
Agent Runtime Assurance
Independent, evidence-backed assurance for agentic AI runtimes. ConstantX measures whether your agent completes legitimate work and whether your runtime contains adversarial requests, and it produces an evidence trail your team can verify later.
Engagement Tiers
- Target-scoped threat model: actors, trust boundaries, target enforcement points, and deployment assumptions
- Adversarial and positive-path scenario suites derived from the target's actual authority boundaries
- Full traceability: threat → scenario → run artifact → verdict
- Decision Coverage report with valid_commit, bounded_failure, and undefined_behavior rates
- Threat containment, positive-path success, and undefined behavior rates, reported with Wilson 95% confidence intervals
- Remediation backlog and retest guidance for observed gaps
- Hash-bound evidence bundle with raw traces, protocol signals, verdicts, manifest, and verifier instructions
- Delivered in 1–3 weeks
- Everything in the point-in-time assurance engagement
- Re-run after model snapshot, policy, tool, prompt, or runtime changes
- Drift comparison against previous manifest-bound engagements
- Updated evidence bundle and retest notes for fixed findings
- Priority scheduling for production-bound deployments
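
For readers who want to see what the Wilson 95% confidence intervals in the report mean, here is a minimal sketch of the standard Wilson score interval for a binomial proportion. The function name `wilson_interval` and its parameters are illustrative assumptions, not ConstantX tooling; only the formula itself is standard.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.959963984540054):
    """Return the (lower, upper) Wilson score interval for a proportion.

    The default z is the two-sided 95% normal quantile. The function name
    and signature are hypothetical, chosen for this illustration only.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")

    p_hat = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    # The interval is centred on a value pulled slightly toward 0.5.
    centre = (p_hat + z2 / (2 * trials)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / trials + z2 / (4 * trials * trials)
    )
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)
```

As an example, 45 contained runs out of 50 gives an observed rate of 0.90 and an interval of roughly 0.79 to 0.96. Unlike the simpler normal approximation, the Wilson interval stays inside [0, 1] and behaves sensibly for small suites or rates near 0 or 1.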
How It Works
- Threat Model. Map the real actors, authority boundaries, tools, memory, network paths, and target enforcement points for the assessed deployment.
- Scenario Design. Build adversarial and positive-path suites from the threat model, then validate them before the engagement runner consumes them.
- Execution. Run the scenario suite through the target runtime under the declared deployment profile and signal contract.
- Reduction. Deterministic verdict assignment: every run classified as valid_commit, bounded_failure, or undefined_behavior.
- Reporting. Report threat containment, positive-path success, and undefined behavior separately, with confidence intervals and a remediation backlog.
- Delivery. Verifiable artifact package with manifest, hash sidecar, raw run artifacts, and report.
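
To make the Reduction and Reporting steps concrete, the sketch below tallies per-run verdicts into the three rates named above. It assumes verdicts have already been assigned and arrive as plain strings. The helper name `verdict_rates` is hypothetical, and the sketch does not attempt to reproduce how ConstantX assigns verdicts or defines the Decision Coverage metric itself.

```python
from collections import Counter
from typing import Dict, Iterable

# The three verdict classes named in the Reduction step.
VERDICTS = ("valid_commit", "bounded_failure", "undefined_behavior")


def verdict_rates(verdicts: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of runs in each verdict class.

    Hypothetical helper for illustration: input is one verdict string per run.
    """
    counts = Counter(verdicts)
    unknown = set(counts) - set(VERDICTS)
    if unknown:
        # Reject anything outside the three declared classes instead of
        # silently dropping it, so every run is accounted for.
        raise ValueError(f"unknown verdicts: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no runs to summarize")
    return {name: counts[name] / total for name in VERDICTS}
```

Because every run lands in exactly one class and unknown labels are rejected, the three rates always sum to 1 and the same input always yields the same summary.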
Evidence Standard
Every ConstantX assurance engagement produces:
- Decision Coverage metric with 95% Wilson score confidence intervals
- Separate accounting for legitimate task success, contained adversarial behavior, and undefined behavior
- Threat traceability from threat model entry to scenario, raw artifact, and empirical verdict
- Manifest and SHA-256 hash sidecar binding inputs, runtime closure, and outputs
- Validity window tied to the pinned target, model, suite set, and evidence hashes
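
As an illustration of how a manifest and SHA-256 hash sidecar let your team check a bundle later, here is a minimal verification sketch. The manifest layout (a JSON object mapping relative file paths to hex digests), the file name `manifest.json`, and the function names are all assumptions made for this example. The actual bundle format and verifier instructions ship with each engagement.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def sha256_file(path: Path, chunk_size: int = 1 << 16) -> str:
    """Hash a file in chunks so large raw traces need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(bundle_dir: Path, manifest_name: str = "manifest.json") -> List[str]:
    """Return the relative paths whose contents do not match the manifest.

    Assumed (hypothetical) manifest format: {"relative/path": "<hex sha256>"}.
    An empty list means every listed file is present and unmodified.
    """
    manifest_text = (bundle_dir / manifest_name).read_text(encoding="utf-8")
    manifest = json.loads(manifest_text)
    mismatches = []
    for rel_path, expected in sorted(manifest.items()):
        target = bundle_dir / rel_path
        if not target.is_file() or sha256_file(target) != expected.lower():
            mismatches.append(rel_path)
    return mismatches
```

A check like this only shows that the files match the manifest. Trusting the manifest itself still depends on obtaining its hash through a channel you trust.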
Reports can map findings to NIST AI RMF, OWASP ASI, MITRE ATLAS, or your internal control language. The evidence is deployment-specific: your target runtime, model snapshot, tool configuration, policy set, and scenario suite, not a generic platform summary.
Get in touch
Tell us what your agent can do, what it can reach, and what would hurt if it crossed the wrong boundary.
[email protected]