Services

Agent Runtime Assurance

Independent, evidence-backed assurance for agentic AI runtimes. ConstantX measures whether your agent completes legitimate work and whether your runtime contains adversarial requests, and it produces an evidence trail your team can verify later.

Engagement Tiers

Agent Runtime Assurance

Point-in-time assurance package for one target runtime

Target-scoped threat model: actors, trust boundaries, target enforcement points, and deployment assumptions
Adversarial and positive-path scenario suites derived from the target's actual authority boundaries
Full traceability: threat → scenario → run artifact → verdict
Decision Coverage report with valid_commit, bounded_failure, and undefined_behavior rates
Threat containment, positive-path success, and undefined behavior rates, reported with 95% Wilson confidence intervals
Remediation backlog and retest guidance for observed gaps
Hash-bound evidence bundle with raw traces, protocol signals, verdicts, manifest, and verifier instructions
Delivered in 1–3 weeks
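As an illustration of the rates named in the Decision Coverage report, here is a minimal Python sketch. It assumes each rate is the count of runs with that verdict divided by the total number of runs; the page does not spell out the exact formula, and the function name `verdict_rates` is hypothetical. Only the three verdict labels come from the deliverables listed above.

```python
from collections import Counter

# The three verdict labels named in the report description.
VERDICTS = ("valid_commit", "bounded_failure", "undefined_behavior")


def verdict_rates(verdicts):
    """Return the share of runs per verdict label.

    Assumption: a rate is count / total runs. Labels outside the
    three known verdicts are rejected instead of silently counted.
    """
    verdicts = list(verdicts)
    if not verdicts:
        raise ValueError("at least one run is required")
    counts = Counter(verdicts)
    unknown = set(counts) - set(VERDICTS)
    if unknown:
        raise ValueError(f"unknown verdict labels: {sorted(unknown)}")
    total = len(verdicts)
    return {label: counts[label] / total for label in VERDICTS}
```

Rejecting unknown labels keeps the three rates summing to one, so a mislabelled run cannot quietly shrink the undefined_behavior share.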

Retest Track

Repeatable assurance after model, policy, or runtime changes

Everything in the point-in-time assurance engagement
Re-run after model snapshot, policy, tool, prompt, or runtime changes
Drift comparison against previous manifest-bound engagements
Updated evidence bundle and retest notes for fixed findings
Priority scheduling for production-bound deployments

How It Works

Threat Model. Map the real actors, authority boundaries, tools, memory, network paths, and target enforcement points for the assessed deployment.
Scenario Design. Build adversarial and positive-path suites from the threat model, then validate them before the engagement runner consumes them.
Execution. Run the scenario suite through the target runtime under the declared deployment profile and signal contract.
Reduction. Assign verdicts deterministically: every run is classified as valid_commit, bounded_failure, or undefined_behavior.
Reporting. Report threat containment, positive-path success, and undefined behavior separately, with confidence intervals and a remediation backlog.
Delivery. Deliver a verifiable artifact package with manifest, hash sidecar, raw run artifacts, and report.
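To show what checking a delivered package against its hash sidecar might look like, here is a minimal Python sketch. It assumes the sidecar is a text file in the common `sha256sum` layout (one `<hex digest>  <relative path>` pair per line); the actual ConstantX sidecar format, the file name `SHA256SUMS`, and the function names are assumptions made for illustration. The verifier instructions shipped with a real bundle take precedence.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=65536):
    """Hash a file in chunks so large trace files do not load into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sidecar(bundle_dir, sidecar_name="SHA256SUMS"):
    """Return the relative paths whose hash is missing or does not match.

    Assumed sidecar layout (hypothetical): "<hex digest>  <relative path>".
    An empty list means every listed file matched its recorded digest.
    """
    bundle = Path(bundle_dir)
    mismatches = []
    sidecar_text = (bundle / sidecar_name).read_text(encoding="utf-8")
    for line in sidecar_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, rel_path = line.split(maxsplit=1)
        rel_path = rel_path.lstrip("*")  # sha256sum marks binary mode with '*'
        target = bundle / rel_path
        if not target.is_file() or sha256_file(target) != expected.lower():
            mismatches.append(rel_path)
    return mismatches
```

A check like this only confirms that the files match the sidecar. It says nothing about who produced the sidecar, so it is normally paired with the manifest and a trusted copy of the recorded hashes.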

Evidence Standard

Every ConstantX assurance engagement produces:

Decision Coverage metric with 95% Wilson score confidence intervals
Separate accounting for legitimate task success, contained adversarial behavior, and undefined behavior
Threat traceability from threat model entry to scenario, raw artifact, and empirical verdict
Manifest and SHA-256 hash sidecar binding inputs, runtime closure, and outputs
Validity window tied to the pinned target, model, suite set, and evidence hashes
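For readers unfamiliar with the Wilson score interval, here is a minimal Python sketch of the standard formula for a binomial proportion. The function name `wilson_interval` and its parameters are illustrative, not part of any ConstantX tooling, and the sketch assumes each scenario run is treated as an independent trial.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, trials, confidence=0.95):
    """Wilson score interval for a proportion, returned as (lower, upper)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / trials
    denom = 1 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half_width = z * sqrt(
        p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)
    ) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half_width), min(1.0, center + half_width)
```

For example, 8 matching verdicts out of 10 runs gives an interval of roughly 0.49 to 0.94. Unlike the simpler normal-approximation interval, the Wilson interval stays inside [0, 1] and remains informative when the observed rate is 0 or 1, which matters for small scenario suites.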

Reports can map findings to NIST AI RMF, OWASP ASI, MITRE ATLAS, or your internal control language. The evidence is deployment-specific: your target runtime, model snapshot, tool configuration, policy set, and scenario suite, not a generic platform summary.

Get in touch

Tell us what your agent can do, what it can reach, and what would hurt if it crossed the wrong boundary.

[email protected]