How It Works

How We Evaluate Risk for AI Agents

Every AI agent carries a unique risk profile shaped by what it can do, where it operates, and how it fails. We assess risk at the agent level — not the company level — because that's where the exposure actually lives.

Why This Matters

Two agents, same model, wildly different risk.

An agent that drafts internal summaries and one that executes financial transactions may run on the same foundation model — but their risk profiles are worlds apart. What matters isn't the model; it's the instructions, tools, data access, and guardrails wrapped around it. Instructions become code, and code carries liability. That's why we underwrite at the agent level.

Autonomy & Action Surface · Exposure & Context · Control Strength · Auditability · Failure Impact

Accounts Payable Agent

Processes invoices, approves payments, manages vendor accounts

Autonomy & Action Surface · Exposure & Context · Control Strength · Auditability · Failure Impact

Logistics Agent

Routes shipments, dispatches carriers

Risk Dimensions

What we measure.

We evaluate five dimensions to build a complete picture of how an agent behaves, what it can access, and how it fails: autonomy and action surface, exposure and context, control strength, auditability, and failure impact. Our SDK plugs directly into agent platforms to inspect how each agent is built, configured, and constrained.
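
As a rough illustration of what a per-agent profile could look like as data, the sketch below stores one rating per dimension. The class name, the 0-10 scale, and the example values are assumptions made for illustration; they are not the SDK's actual schema or real assessment results.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AgentRiskProfile:
    """Hypothetical record: one 0-10 rating for each of the five dimensions."""
    autonomy_action_surface: int
    exposure_context: int
    control_strength: int
    auditability: int
    failure_impact: int

    def __post_init__(self):
        # Reject ratings outside the assumed 0-10 scale.
        for name, value in asdict(self).items():
            if not 0 <= value <= 10:
                raise ValueError(f"{name} must be between 0 and 10, got {value}")


# Illustrative values only: two agents that could share a model but not a profile.
summary_agent = AgentRiskProfile(2, 3, 6, 7, 2)
payments_agent = AgentRiskProfile(8, 7, 6, 7, 9)

# Dimensions on which the two agents differ.
differing = [
    name
    for name, value in asdict(summary_agent).items()
    if value != asdict(payments_agent)[name]
]
print(differing)
```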

Build Quality

Then we inspect for good build practices.

Beyond risk profiling, we check whether the agent follows the engineering practices that reduce the likelihood and severity of failures. An agent must meet this technical quality standard before it is eligible for coverage.

Permissioning

Least-privilege access controls scoped to each agent’s role — no blanket admin tokens, no over-provisioned service accounts.
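
A least-privilege check of this kind can be sketched as a comparison between the scopes an agent holds and the scopes its role needs. The function name and the `resource:action` scope format are hypothetical, chosen only for the example.

```python
def audit_scopes(granted: set[str], required: set[str]) -> list[str]:
    """Flag wildcard scopes and scopes the agent's role does not need."""
    findings = []
    for scope in sorted(granted):
        if scope == "*" or scope.endswith(":*"):
            findings.append(f"wildcard scope: {scope}")
        elif scope not in required:
            findings.append(f"unused scope: {scope}")
    return findings


# Hypothetical scopes for an invoice-reading agent.
findings = audit_scopes(
    granted={"invoices:read", "payments:refund", "vendors:*"},
    required={"invoices:read"},
)
for finding in findings:
    print(finding)
```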

Tool Safety

Validated tool schemas, sandboxed execution, and input/output filtering for every external call the agent makes.
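
To make "validated tool schemas" concrete, here is a minimal sketch that checks a tool call's arguments against a declared schema before the call is allowed to run. The schema format (field name mapped to a Python type) and the refund tool's fields are assumptions for illustration; a production system would more likely use a full schema language.

```python
# Hypothetical schema for a refund tool: field name -> required Python type.
REFUND_SCHEMA = {"charge_id": str, "amount_cents": int}


def validate_args(args: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the call may proceed."""
    errors = []
    for key in args:
        if key not in schema:
            errors.append(f"unexpected field: {key}")
    for key, expected in schema.items():
        if key not in args:
            errors.append(f"missing field: {key}")
        elif type(args[key]) is not expected:  # strict: bool is not accepted as int
            errors.append(
                f"wrong type for {key}: expected {expected.__name__}, "
                f"got {type(args[key]).__name__}"
            )
    return errors


print(validate_args({"charge_id": "ch_1", "amount_cents": "500", "note": "x"}, REFUND_SCHEMA))
```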

Human-in-the-Loop

Defined escalation paths and approval gates for high-stakes actions — so humans stay in the loop where it matters.
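
An approval gate can be as simple as a predicate evaluated before an action runs. The sketch below assumes a dollar threshold and a short list of always-reviewed actions; both values, and all names, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

APPROVAL_THRESHOLD_USD = 1_000.0                    # hypothetical threshold
ALWAYS_REVIEW = {"change_vendor_bank_details"}      # hypothetical high-stakes actions


@dataclass
class Action:
    name: str
    amount_usd: float = 0.0


def needs_human_approval(action: Action) -> bool:
    return action.name in ALWAYS_REVIEW or action.amount_usd >= APPROVAL_THRESHOLD_USD


def execute(action: Action, approved_by: Optional[str] = None) -> str:
    """Run low-stakes actions directly; escalate the rest until a human signs off."""
    if needs_human_approval(action) and approved_by is None:
        return "escalated"
    return "executed"


print(execute(Action("pay_invoice", 250.0)))                           # below threshold
print(execute(Action("pay_invoice", 5_000.0)))                         # needs approval
print(execute(Action("pay_invoice", 5_000.0), approved_by="finance"))  # approved
```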

Data Minimisation

Agents access only the data they need, for as long as they need it. No persistent caches of sensitive information.
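
One way to avoid persistent caches is to give every stored value a time-to-live and delete it on expiry. The class below is a minimal sketch under that assumption; the name `ExpiringStore` and the injectable clock are illustrative choices, not part of any described SDK.

```python
import time


class ExpiringStore:
    """Holds values only until their time-to-live elapses."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._items = {}             # key -> (value, expiry time)

    def put(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]     # drop expired data rather than keep it around
            return None
        return value
```

Reads after the TTL return `None`, and the expired value is removed from memory at that point.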

Auditability

Structured logging of every decision, tool call, and state change — producing a defensible trail for incident review.
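
Structured logging here means one machine-readable record per event rather than free-form text. A minimal sketch, assuming JSON lines with a UTC timestamp; the field names are hypothetical.

```python
import json
from datetime import datetime, timezone


def audit_record(agent_id: str, event: str, **details) -> str:
    """Serialize one decision, tool call, or state change as a JSON line."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "event": event,
        "details": details,
    }
    return json.dumps(record, sort_keys=True)


# Example: record a tool call made by a hypothetical accounts payable agent.
print(audit_record("ap-agent", "tool_call", tool="stripe_refund", amount_cents=500))
```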

Model & Prompt Governance

Version-pinned models, reviewed system prompts, and change-management controls that prevent silent drift.
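
Silent drift can be caught by comparing what is running against what was reviewed. The sketch below assumes the approved configuration stores a pinned model identifier and a SHA-256 fingerprint of the reviewed system prompt; the dictionary keys and prompt text are hypothetical.

```python
import hashlib


def prompt_fingerprint(prompt: str) -> str:
    """Stable fingerprint of a system prompt, used to detect unreviewed edits."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def check_drift(approved: dict, running: dict) -> list[str]:
    """Compare the running agent against its approved, version-pinned configuration."""
    issues = []
    if running["model"] != approved["model"]:
        issues.append(f"model changed: {approved['model']} -> {running['model']}")
    if prompt_fingerprint(running["system_prompt"]) != approved["prompt_sha256"]:
        issues.append("system prompt differs from reviewed version")
    return issues


reviewed_prompt = "Approve invoices only when they match a purchase order."
approved = {"model": "opus-4-5", "prompt_sha256": prompt_fingerprint(reviewed_prompt)}
running = {"model": "opus-4-6", "system_prompt": "Approve all invoices."}
print(check_drift(approved, running))
```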

Output

Every assessment produces four outputs.

These are the structured outputs we deliver to clients and insurance partners — a shared, precise view of agent risk that both technical teams and underwriters can act on.

01

Controls Score

A quantified score for how well the agent is built and operated — permissioning, guardrails, logging, and governance.

Sample score: 82

02

Risk Register

A structured register of risks tailored to the agent — what can go wrong, how likely it is, and what controls are in place.
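
A register entry of this shape can be sketched as a small record: what can go wrong, how likely it is, and which controls apply. The field names, risks, and likelihood values below are invented for illustration and are not real assessment data.

```python
from dataclasses import dataclass, field


@dataclass
class RiskEntry:
    risk: str                       # what can go wrong
    annual_likelihood: float        # assumed probability per year, 0.0 to 1.0
    controls: list[str] = field(default_factory=list)


def uncontrolled(register: list[RiskEntry]) -> list[str]:
    """Return the risks that have no control recorded against them."""
    return [entry.risk for entry in register if not entry.controls]


# Hypothetical register for an accounts payable agent.
register = [
    RiskEntry("duplicate payment approved", 0.05, ["approval gate above threshold"]),
    RiskEntry("payment sent to spoofed vendor account", 0.01),
]
print(uncontrolled(register))
```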

03

Loss Model

What “bad events” look like in dollars, time, and legal exposure — modeled from the agent’s actual blast radius.
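
As a simplified illustration of the dollar side of such a model, expected annual loss can be approximated as the sum of probability times severity over a set of scenarios. This is a generic textbook calculation, not a description of the actual loss model, and the scenario figures are made up; time and legal exposure are left out.

```python
def expected_annual_loss(scenarios: list[tuple[float, float]]) -> float:
    """Sum of (annual probability x loss in dollars) across scenarios."""
    return sum(probability * loss_usd for probability, loss_usd in scenarios)


# Hypothetical scenarios: (annual probability, loss in USD).
scenarios = [
    (0.05, 20_000.0),    # duplicate payment
    (0.01, 250_000.0),   # payment to a spoofed vendor account
]
print(round(expected_annual_loss(scenarios), 2))
```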

04

Insurability Recommendation

A dynamic recommendation that updates as the agent changes — covering eligibility, pricing, and conditions.

Conditional · Insurable · Preferred
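
The three tiers could be derived from a score with simple cut-offs, re-evaluated whenever the score changes. The thresholds below and the "Not eligible" fallback are assumptions for the sketch; an actual recommendation would presumably also draw on the risk register and loss model.

```python
def recommend(controls_score: int) -> str:
    """Map a 0-100 controls score to a tier using hypothetical cut-offs."""
    if controls_score >= 85:
        return "Preferred"
    if controls_score >= 70:
        return "Insurable"
    if controls_score >= 50:
        return "Conditional"
    return "Not eligible"   # assumed label for scores below every tier


for score in (45, 60, 82, 90):
    print(score, recommend(score))
```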

Continuous Evaluation

A living risk score, not a one-time assessment.

Agents change: new tools get added, prompts get rewritten, permissions expand. A point-in-time audit can't keep up. Our SDK continuously monitors how agents are configured and how they behave, updating the risk score as the agent evolves. If risk drifts outside the insured envelope, we flag it before it becomes a claim.
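
Detecting this kind of change can be sketched as a diff between two configuration snapshots. The snapshot shape (a model name plus a list of tools) and the event strings are assumptions for the example, not the SDK's actual format.

```python
def diff_config(old: dict, new: dict) -> list[str]:
    """List configuration changes between two snapshots of the same agent."""
    events = []
    for tool in sorted(set(new["tools"]) - set(old["tools"])):
        events.append(f"tool_added: {tool}")
    for tool in sorted(set(old["tools"]) - set(new["tools"])):
        events.append(f"tool_removed: {tool}")
    if old["model"] != new["model"]:
        events.append(f"model_change: {old['model']} -> {new['model']}")
    return events


before = {"model": "opus-4-5", "tools": ["lookup_invoice"]}
after = {"model": "opus-4-6", "tools": ["lookup_invoice", "stripe_refund"]}
for event in diff_config(before, after):
    print(event)
```

Each detected event would be a natural trigger for recomputing the risk score.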

[Illustration: a stream of encrypted log lines, timestamped 14:32:05 to 14:32:18, with most content redacted. Visible entries include model=opus-4-5 (14:32:06), tool_added: stripe_refund (14:32:11), and model_change to opus-4-6 (14:32:14). The adjacent panel is labeled Agent Risk Profile and lists Risk Register, Controls Score, Loss Model, and Insurability Rec.]

Get Evaluated

Want to get your agent evaluated?

Platforms & agent-building companies

Embed insurance-grade risk evaluation into your platform so every agent ships with a clear risk profile and insurability signal.

Enterprises deploying agents

Get a per-agent risk assessment tied to actual configuration — not a generic AI policy — so you can deploy with confidence and coverage.