How It Works

How We Evaluate Risk for AI Agents

Every AI agent carries a unique risk profile shaped by what it can do, where it operates, and how it fails. We assess risk at the agent level — not the company level — because that's where the exposure actually lives.

Policy Structure

Scheduling Agents on the Policy

Each evaluated agent is scheduled onto the organisation's base policy. If a failure occurs, claims reference the specific agent responsible, allowing insurers to clearly identify the source of the event.

Why This Matters

Two agents, same model, wildly different risk.

An agent that drafts internal summaries and one that executes financial transactions may run on the same foundation model — but their risk profiles are worlds apart. Risk depends on how the system is deployed: the instructions it receives, the policies it interprets, the tools it can invoke, the users interacting with it, and the volume of actions it performs. That's why we underwrite at the agent level.

Misrepresentation · Operational Failure · Financial Error · Data Exposure · Regulatory Breach

Accounts Payable Agent

Processes invoices, approves payments, manages vendor accounts

Misrepresentation · Operational Failure · Financial Error · Data Exposure · Regulatory Breach

Logistics Agent

Routes shipments, dispatches carriers

Controls Evaluation and Score

First, we assess eligibility and inspect the agent for sound build practices.

Beyond risk profiling, we check whether the agent follows the engineering practices that reduce the likelihood and severity of failures. Examples of the engineering controls we evaluate include:

Permissioning

Least-privilege access controls scoped to each agent’s role — no blanket admin tokens, no over-provisioned service accounts.

Tool Safety

Validated tool schemas, sandboxed execution, and input/output filtering for every external call the agent makes.

Human-in-the-Loop for High-Risk Actions

Defined escalation paths and approval gates for high-stakes actions — so humans stay in the loop where it matters.

Data Minimisation

Agents access only the data they need, for as long as they need it. No persistent caches of sensitive information.

Auditability

Structured logging of every decision, tool call, and state change — producing a defensible trail for incident review.

Model & Prompt Governance

Version-pinned models, reviewed system prompts, and change-management controls that prevent silent drift.
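One way to picture a controls evaluation like the one above is as a weighted checklist rolled up into a single score. The sketch below is illustrative only: the control names mirror the list above, but the weights, pass/fail values, and scoring formula are invented assumptions, not our actual rubric.

```python
from dataclasses import dataclass

@dataclass
class ControlCheck:
    name: str
    passed: bool
    weight: float  # relative contribution to the overall controls score

def controls_score(checks: list[ControlCheck]) -> float:
    """Weighted fraction of controls passed, on a 0.0-1.0 scale."""
    total = sum(c.weight for c in checks)
    return sum(c.weight for c in checks if c.passed) / total

# Hypothetical evaluation of a single agent against the controls above.
checks = [
    ControlCheck("least_privilege_permissions", True, 3.0),
    ControlCheck("tool_schema_validation", True, 2.0),
    ControlCheck("human_approval_gates", False, 3.0),  # missing approval gate
    ControlCheck("data_minimisation", True, 1.0),
    ControlCheck("structured_audit_logging", True, 1.0),
    ControlCheck("version_pinned_model", True, 1.0),
]
print(round(controls_score(checks), 2))  # 0.73
```

A weighted sum is the simplest possible aggregation; it makes the point that a missing high-weight control (here, approval gates) drags the score down more than a missing low-weight one.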

Risk Dimensions

What we measure.

We evaluate five critical dimensions to build a complete picture of how an agent behaves, what it can access, and how it fails. Our SDK plugs directly into agent platforms to inspect how each agent is built, configured, and constrained.
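As a rough sketch, a per-agent profile over these five dimensions can be thought of as a validated mapping from dimension to score. The helper and the example values below are illustrative assumptions, not the SDK's real output format.

```python
# The five risk dimensions named in this section.
RISK_DIMENSIONS = [
    "misrepresentation",
    "operational_failure",
    "financial_error",
    "data_exposure",
    "regulatory_breach",
]

def risk_profile(scores: dict[str, float]) -> dict[str, float]:
    """Check that a profile covers every dimension with a score in [0, 1]."""
    if set(scores) != set(RISK_DIMENSIONS):
        raise ValueError("profile must score exactly the five dimensions")
    if not all(0.0 <= v <= 1.0 for v in scores.values()):
        raise ValueError("scores must lie in [0, 1]")
    return scores

# Hypothetical scores for an accounts-payable-style agent:
# heavy financial exposure, moderate data exposure.
profile = risk_profile({
    "misrepresentation": 0.2,
    "operational_failure": 0.4,
    "financial_error": 0.8,
    "data_exposure": 0.5,
    "regulatory_breach": 0.3,
})
```

The point of the structure is the earlier one about the two example agents: the same five axes, scored very differently depending on deployment.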

Loss Model

This allows us to price the risk.

We model what “bad events” look like in dollars, time, and legal exposure — calibrated from the agent's actual blast radius, not generic industry benchmarks.
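The core of any such model can be sketched as frequency times severity. The function is standard actuarial arithmetic; the action counts, failure rate, and dollar figures below are invented for illustration and are not calibrated numbers.

```python
def expected_annual_loss(event_rate_per_year: float,
                         mean_severity_usd: float) -> float:
    """Classic frequency-times-severity expected loss, in USD per year."""
    return event_rate_per_year * mean_severity_usd

# Illustration: an agent executing ~50,000 payment actions a year with an
# assumed 0.01% failure rate and an average $2,000 loss per failure event.
events_per_year = 50_000 * 0.0001   # 5 expected failure events per year
print(expected_annual_loss(events_per_year, 2_000.0))  # 10000.0
```

In practice the severity side also has to capture time and legal exposure, not just dollars, and the inputs come from the agent's actual blast radius rather than generic benchmarks, as noted above.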

[Chart: estimated loss severity, from $0 to max loss]

Continuous Evaluation

A living risk score, not a one-time assessment.

Agents change — new tools get added, prompts get rewritten, permissions expand. A point-in-time audit can't keep up. Our SDK monitors configuration changes, permissions, tools, and deployment parameters so the risk profile can be updated as agents evolve. If risk drifts outside the insured envelope, we flag it before it becomes a claim.
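The kind of drift detection this describes can be sketched as a diff between two configuration snapshots. The snapshot fields and helper below are hypothetical; the example values (a model bump and a newly added refund tool) are the sort of changes that would move an agent outside its insured envelope.

```python
def config_drift(before: dict, after: dict) -> list[str]:
    """Report which monitored fields changed between two config snapshots."""
    return [
        f"{key}: {before.get(key)!r} -> {after.get(key)!r}"
        for key in sorted(set(before) | set(after))
        if before.get(key) != after.get(key)
    ]

before = {"model": "opus-4-5", "tools": ("search",), "max_refund_usd": 0}
after = {"model": "opus-4-6",
         "tools": ("search", "stripe_refund"),
         "max_refund_usd": 500}

for change in config_drift(before, after):
    print(change)
```

Each reported change would then be checked against the terms the agent was underwritten on, so material drift is flagged rather than silently accumulating.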

[Encrypted log stream illustration. Visible events: 14:32:11 tool_added: stripe_refund; 14:32:14 model_change: opus-4-5 → opus-4-6. Flagged change types: AI model change, updated context window, new tool added, permissions updated.]
Get Evaluated

Want to get your agent evaluated?

Platforms & agent-building companies

Embed insurance-grade risk evaluation into your platform so every agent ships with a clear risk profile and insurability signal.

Enterprises deploying agents

Get a per-agent risk assessment tied to actual configuration — not a generic AI policy — so you can deploy with confidence and coverage.