Behavioural Monitoring for AI Agents: The Compliance Layer

An AI agent that only gets checked at the moment it is deployed is like a new employee who goes through a background check on day one but never again. You verified who they were when they started. You have no ongoing picture of whether they are doing what they said they would do, or whether something has gone wrong since.

This is the gap that behavioural monitoring fills for AI agents. It is not about distrust. It is about maintaining the continuous visibility that any well-run operation needs when autonomous systems are acting on your behalf.

What Behavioural Monitoring Actually Means

Behavioural monitoring for AI agents means capturing a structured record of what each agent does in production — every API call, every database query, every transaction submission, every authentication event — and using that stream of events to compute an ongoing picture of whether the agent is operating within expected parameters.

This is different from logging. Logging is retrospective: you look at logs after something goes wrong to understand what happened. Behavioural monitoring is continuous: you are computing risk indicators in real time, updating a risk score with every event, and triggering alerts or automated responses when the score crosses defined thresholds.

The distinction matters enormously for compliance. Article 12 of the EU AI Act requires high-risk AI systems to support automatic recording of events (logs) over the lifetime of the system. Those logging capabilities are not only there for after-the-fact review: they must make it possible to identify situations in which the system may present a risk, and to support monitoring of the system while it operates. You cannot detect a malfunction in real time from logs you review weekly. You need a continuous monitoring layer.

For a detailed look at how Kakunin's event ingestion works technically, the event ingestion documentation covers the API design, event types, and risk scoring model.

The Eight Core Event Types

In a well-instrumented AI agent deployment, there are eight categories of events that matter most for behavioural risk assessment.

Transaction events capture any action the agent takes that moves value or resources — a trade submission, a payment initiation, a resource allocation. These are your highest-stakes events. Anomalies here — volume spikes, unusual counterparties, off-hours timing — are the earliest signals of a malfunctioning or compromised agent.

Data access events record every time the agent reads from a database, calls an external data source, or queries a customer record. An agent that suddenly begins accessing records far outside its normal data scope is exhibiting a pattern that might indicate prompt injection, misconfiguration, or credential theft.

API call events track outbound calls to external services. An agent that begins calling unusual endpoints, or making requests at rates that exceed its normal operational pattern, is worth investigating. This category is particularly relevant for supply chain risk: an agent calling a newly registered external API is a red flag.

Authentication events capture every time the agent authenticates to a system, including failures. A spike in authentication failures often precedes a security incident. An agent that suddenly begins attempting to authenticate to systems it has never accessed before is exhibiting anomalous behaviour.

Configuration change events record any time the agent's own configuration is modified. This is a meta-level signal: who or what is changing the agent's parameters, and do those changes correspond to an authorised management action?

Communication events track messages the agent sends to other agents or external systems. Multi-agent systems create particular monitoring challenges because agents can communicate with each other. A communication channel between two agents that was never supposed to exist is a significant anomaly.

Resource usage events monitor compute, memory, and storage consumption. An agent entering an infinite loop, or being fed an adversarial prompt designed to consume maximum compute, will show up in resource usage before it shows up in business metrics.

Decision events record the agent's outputs and reasoning where available. In high-stakes domains — credit decisions, medical recommendations, legal analysis — you want a record of not just what the agent decided, but the inputs it used to reach that decision.
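To make the taxonomy concrete, here is a minimal sketch of how the eight categories might be represented in code. The class names, field names, and payload shape are assumptions made for illustration; they are not Kakunin's actual event schema, which is described in the event ingestion documentation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any


class EventType(Enum):
    """The eight event categories described in this section."""
    TRANSACTION = "transaction"
    DATA_ACCESS = "data_access"
    API_CALL = "api_call"
    AUTHENTICATION = "authentication"
    CONFIG_CHANGE = "config_change"
    COMMUNICATION = "communication"
    RESOURCE_USAGE = "resource_usage"
    DECISION = "decision"


@dataclass(frozen=True)
class AgentEvent:
    """Hypothetical event record; all field names are illustrative."""
    agent_id: str
    event_type: EventType
    attributes: dict[str, Any] = field(default_factory=dict)
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def to_payload(self) -> dict[str, Any]:
        """Serialise the event to a JSON-friendly dictionary."""
        return {
            "agent_id": self.agent_id,
            "event_type": self.event_type.value,
            "timestamp": self.timestamp.isoformat(),
            "attributes": dict(self.attributes),
        }


# Example: a failed authentication attempt against a hypothetical system.
event = AgentEvent(
    agent_id="trading-agent-01",
    event_type=EventType.AUTHENTICATION,
    attributes={"target_system": "orders-db", "success": False},
)
print(event.to_payload()["event_type"])
```

Keeping the category as an enumerated value rather than free text makes it harder for an instrumented agent to emit an event that the scoring side does not recognise.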

How Risk Scoring Works: The 30-Day Rolling Window

Individual events are not particularly informative on their own. A single authentication failure might be a network glitch. An unusual transaction might be a legitimate edge case. Risk scoring becomes meaningful when you look at patterns across a time window.

Kakunin's risk scoring model uses a 30-day rolling window. Every event that arrives for a given agent is scored against the agent's historical behaviour over the previous 30 days. The score reflects both the absolute nature of the event and how unusual it is relative to the agent's established baseline.

The 30-day window is calibrated to balance two competing needs. Too short a window — say, 24 hours — produces too many false positives: every time an agent handles an unusual but legitimate workload, its risk score would spike. Too long a window — six months — makes the system slow to detect genuine behavioural drift.

Thirty days covers a full monthly cycle and roughly four weekly cycles for most regulated operations, which means the baseline reflects realistic operational variation rather than artificial stability. An agent that legitimately handles higher volume at month-end (common in financial services) will have a baseline that accounts for that variation. A sudden mid-month volume spike that has no historical precedent will register as anomalous even though the absolute volume is not exceptional.
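The idea of scoring an event against a rolling baseline can be sketched in a few lines. This is not Kakunin's scoring model, which the article does not specify; it is a deliberately simple stand-in that compares one daily count with the mean and standard deviation of the previous 30 days. The class name, the use of a z-score, and the 6-sigma scaling are all assumptions.

```python
from collections import deque
from statistics import mean, pstdev

WINDOW_DAYS = 30  # window length taken from the article


class RollingBaseline:
    """Keeps the last WINDOW_DAYS daily counts for one agent."""

    def __init__(self, window_days: int = WINDOW_DAYS) -> None:
        # A bounded deque discards the oldest day automatically.
        self.daily_counts: deque[int] = deque(maxlen=window_days)

    def record_day(self, count: int) -> None:
        self.daily_counts.append(count)

    def anomaly(self, count: int) -> float:
        """Return a 0..1 value for how unusual `count` is versus the window."""
        if len(self.daily_counts) < 2:
            return 0.0  # not enough history to judge
        mu = mean(self.daily_counts)
        sigma = pstdev(self.daily_counts) or 1.0  # avoid dividing by zero
        z = abs(count - mu) / sigma
        return min(z / 6.0, 1.0)  # assumed scaling: 6 sigma or more maps to 1.0


baseline = RollingBaseline()
for day in range(31):
    baseline.record_day(100 + (day % 5))  # stable volume with small variation

print(len(baseline.daily_counts))  # only the most recent 30 days are kept
print(baseline.anomaly(102), baseline.anomaly(200))
```

A production model would also weigh the type of event and seasonality within the window, but the bounded window is the part that makes old behaviour age out of the baseline.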

The risk score sits between 0 and 1. Three thresholds (0.3, 0.75, and 0.85) divide that range into four action bands:

0.3 and below: Normal operation. No intervention required.
Above 0.3, up to 0.75: Medium risk. Increased monitoring frequency; alerts generated but agent continues operating.
Above 0.75, up to 0.85: Pre-revocation warning. The agent's score is approaching the auto-revocation threshold. Your team gets a notification; this is the window to investigate before automated action triggers.
Above 0.85: Auto-revocation. The agent's certificate is revoked within 60 seconds. It cannot execute further operations.

This graduated response is important. Immediate revocation on any anomaly would make the system unusable — real operations are noisy, and no behavioural model is perfect. The warning zone gives your team time to investigate and intervene before automated revocation, while the hard threshold at 0.85 ensures that genuinely dangerous behaviour does not persist indefinitely while humans deliberate.
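The graduated response can be written as a small lookup. The threshold values come from the list above; the function and enum names are hypothetical, and the handling of scores that land exactly on 0.75 or 0.85 (treated here as belonging to the lower band) is an assumption.

```python
from enum import Enum


class Action(Enum):
    NORMAL = "normal"
    MEDIUM_RISK = "medium_risk"
    PRE_REVOCATION_WARNING = "pre_revocation_warning"
    AUTO_REVOKE = "auto_revoke"


# Threshold values as stated in the article.
MEDIUM_THRESHOLD = 0.3
WARNING_THRESHOLD = 0.75
REVOCATION_THRESHOLD = 0.85


def action_for(score: float) -> Action:
    """Map a risk score in [0, 1] to its response band."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"risk score must be between 0 and 1, got {score}")
    # Check the most severe band first so each score matches exactly one band.
    if score > REVOCATION_THRESHOLD:
        return Action.AUTO_REVOKE
    if score > WARNING_THRESHOLD:
        return Action.PRE_REVOCATION_WARNING
    if score > MEDIUM_THRESHOLD:
        return Action.MEDIUM_RISK
    return Action.NORMAL


for score in (0.2, 0.5, 0.8, 0.9):
    print(score, action_for(score).value)
```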

For the operational details of responding to risk alerts, the enforcement documentation includes on-call runbooks for each threshold level.

The Compliance Case: Why MiCA Mandates Monitoring

MiCA Article 73 requires regulated entities deploying automated systems to implement "monitoring mechanisms" that can detect whether those systems are operating as intended. This is not aspirational language — it is a compliance obligation with teeth. Firms that cannot demonstrate active monitoring of their AI systems face enforcement action.

The FATF guidance on virtual asset service providers (VASPs) similarly requires continuous transaction monitoring for AML purposes. An AI agent executing trades on behalf of clients must be monitored in exactly the same way as a human trader. The fact that the trades are generated by code does not reduce the AML obligation.

What the regulation asks for, in practical terms, is a combination of:
1. A structured record of agent actions (the event stream)
2. A mechanism for detecting anomalous behaviour (the risk scoring model)
3. A response capability when anomalies are detected (the revocation and alert system)
4. An immutable audit trail that can be presented to regulators (the WORM audit log)

Kakunin's behavioural monitoring layer addresses all four. The audit trail is WORM-backed (write once, read many) so that the historical record cannot be altered after the fact. That integrity matters for the record-keeping obligations in EU AI Act Article 12: logs that could be edited after an incident are weak evidence of what the agent actually did.
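WORM guarantees are normally enforced by the storage layer, and the article does not describe how Kakunin implements them. As a related illustration of what "cannot be altered after the fact" means in practice, the sketch below uses a different and simpler technique, a hash chain, which makes tampering detectable rather than impossible. All names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class HashChainedLog:
    """Append-only log in which each entry commits to its predecessor."""

    def __init__(self) -> None:
        self._entries: list[tuple[dict, str]] = []

    def append(self, record: dict) -> str:
        prev = self._entries[-1][1] if self._entries else GENESIS
        entry_hash = _digest(prev, record)
        self._entries.append((dict(record), entry_hash))
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash; any edited record breaks the chain."""
        prev = GENESIS
        for record, stored_hash in self._entries:
            if _digest(prev, record) != stored_hash:
                return False
            prev = stored_hash
        return True


log = HashChainedLog()
log.append({"agent_id": "trading-agent-01", "risk_score": 0.91})
log.append({"agent_id": "trading-agent-01", "action": "certificate_revoked"})
print(log.verify())

# Simulate someone rewriting history after the fact.
log._entries[0][0]["risk_score"] = 0.10
print(log.verify())
```

A chain like this lets an auditor confirm that the sequence of records is intact without having to trust whoever operates the log.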

For compliance officers managing AI agent deployments, the /for-compliance-officers page gives a more detailed breakdown of how each MiCA and EU AI Act requirement maps to specific system capabilities.

A Practical Example: Detecting Prompt Injection via Behavioural Signals

Prompt injection is one of the most significant active threats against deployed AI agents. An attacker embeds malicious instructions in content the agent is expected to process — a customer message, a document, a web page — and the agent executes those instructions instead of its legitimate task.

The insidious feature of prompt injection is that the agent may appear to be functioning normally from the outside while executing the attacker's instructions. Standard error monitoring will not catch it. Application logging will show the agent processing requests and returning responses, which looks normal.

But behavioural monitoring can catch it. Here is what prompt injection often looks like in the event stream:

A sudden spike in data access events — the agent has been instructed to exfiltrate data
Anomalous API calls to external endpoints the agent has never accessed before
Authentication events to systems outside the agent's normal scope
Unusual communication patterns if the agent is messaging other systems

None of these individually proves prompt injection. But a cluster of these signals — especially when they correlate with a specific incoming request type — is a strong indicator worth investigating. The risk score rising sharply after processing a particular data source is a signal your security team can act on.

This is the practical value of event-level granularity. You cannot detect prompt injection at the aggregate level. You need to know that this specific agent instance, at this specific time, began behaving differently after this specific input.
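One way to express "a cluster of signals after a specific input" is to count how many distinct kinds of anomaly one agent shows in a short window after it processed that input. The sketch below is an assumption-laden illustration: the signal names, the five-minute window, and the three-kinds rule are invented for the example and are not described in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """One anomalous observation for one agent (hypothetical shape)."""
    agent_id: str
    kind: str   # e.g. "data_access_spike", "novel_endpoint"
    at: float   # seconds since epoch


def clustered_kinds(
    signals: list[Signal], agent_id: str, input_at: float, window_s: float = 300.0
) -> set[str]:
    """Distinct anomaly kinds seen for this agent shortly after an input."""
    return {
        s.kind
        for s in signals
        if s.agent_id == agent_id and input_at <= s.at <= input_at + window_s
    }


def worth_investigating(
    signals: list[Signal], agent_id: str, input_at: float, min_kinds: int = 3
) -> bool:
    """True when several different anomaly kinds cluster after one input."""
    return len(clustered_kinds(signals, agent_id, input_at)) >= min_kinds


signals = [
    Signal("agent-7", "data_access_spike", 1010.0),
    Signal("agent-7", "novel_endpoint", 1042.0),
    Signal("agent-7", "out_of_scope_auth", 1100.0),
    Signal("agent-7", "novel_endpoint", 5000.0),     # outside the window
    Signal("agent-9", "data_access_spike", 1020.0),  # a different agent
]

print(worth_investigating(signals, "agent-7", input_at=1000.0))
print(worth_investigating(signals, "agent-9", input_at=1000.0))
```

This only works because each signal carries an agent identifier and a timestamp, which is the event-level granularity the section argues for.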

Streaming Architecture: Why Event-Level Granularity Matters

The temptation in building monitoring systems is to batch events — collect them hourly or daily and then run analysis. This is cheaper and easier to implement than streaming. But batching destroys the ability to detect and respond to in-progress incidents.

If an agent starts misbehaving at 9:00am on a Monday and you process events hourly, you will not know until 10:00am at the earliest. If the agent is executing financial transactions at a rate of 100 per second, that is 360,000 transactions before your monitoring system flags the issue.

Real-time streaming means risk scores update with every event. The system can detect the deviation and trigger alerts within seconds of the first anomalous event, not minutes or hours later. At Kakunin, the event ingestion endpoint is designed to handle 1,000 events per second per tenant, with risk score updates propagating to the alert system within seconds of ingestion.
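The cost of detection delay is simple arithmetic, and it is worth running with your own numbers. The sketch below reproduces the hourly-batch figure from the example above and compares it with a streaming case; the five-second detection delay is an assumed value chosen only to illustrate "within seconds".

```python
def worst_case_exposure(rate_per_second: int, detection_delay_seconds: int) -> int:
    """Transactions an agent can execute before monitoring flags it."""
    return rate_per_second * detection_delay_seconds


RATE = 100  # transactions per second, from the example above

hourly_batch = worst_case_exposure(RATE, 60 * 60)  # one-hour batch interval
streaming = worst_case_exposure(RATE, 5)           # assumed 5-second delay

print(hourly_batch)  # 360000
print(streaming)     # 500
```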

The event streaming documentation includes the specific latency guarantees and the event payload schema. For DevOps teams managing large agent fleets, the operational documentation covers the alert routing and on-call integration patterns.

What Good Looks Like: A Baseline Healthy Agent

It is worth describing what a well-monitored, well-behaving agent looks like from a monitoring perspective. This gives you a reference point for what you are calibrating your thresholds against.

A healthy trading agent, for example, might show the following behavioural fingerprint:

Transaction volumes that follow a daily pattern with a consistent baseline and predictable peak periods
Data access events concentrated in a specific set of market data sources and customer records, with a stable distribution across sources
API calls within a defined set of external endpoints, at rates that correlate with transaction volumes
Authentication events only to expected systems, with a low and consistent failure rate
Risk score staying below 0.3 consistently, with occasional brief excursions to 0.4–0.5 during legitimate high-volume periods

Any sustained deviation from this fingerprint is worth investigating. The power of having 30 days of baseline data is that you can distinguish "the agent is handling an unusual but legitimate situation" (the score rises briefly and returns to baseline) from "the agent has fundamentally changed its behaviour" (the score rises and stays elevated or continues climbing).
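The difference between a brief excursion and a sustained shift can be checked mechanically. In this sketch, a series of periodic score samples is classified by the longest run of consecutive samples above the normal-operation threshold of 0.3; the function names and the choice of six consecutive samples as "sustained" are assumptions.

```python
def longest_run_above(scores: list[float], threshold: float) -> int:
    """Length of the longest streak of consecutive scores above threshold."""
    longest = current = 0
    for score in scores:
        current = current + 1 if score > threshold else 0
        longest = max(longest, current)
    return longest


def classify(
    scores: list[float], threshold: float = 0.3, sustained_samples: int = 6
) -> str:
    """Label a score series as 'baseline', 'transient', or 'sustained'."""
    run = longest_run_above(scores, threshold)
    if run == 0:
        return "baseline"
    return "sustained" if run >= sustained_samples else "transient"


busy_period = [0.1, 0.2, 0.45, 0.4, 0.2, 0.1]          # rises, then recovers
drifting = [0.2, 0.35, 0.4, 0.5, 0.55, 0.6, 0.7]       # rises and keeps rising

print(classify(busy_period))
print(classify(drifting))
```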

Getting Monitoring Into Production

Instrumenting an agent for behavioural monitoring involves adding event emission calls at the points in your code that correspond to the eight event categories. For most agent frameworks, this is a relatively small amount of additional code — you are wrapping existing function calls with event emitters.

The quickstart guide for AI agents includes specific integration examples for common frameworks. The event emission is designed to be non-blocking: your agent calls the Kakunin event endpoint asynchronously and continues its primary task without waiting for the monitoring system to process the event. The monitoring overhead adds under 10ms to the critical path.
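The non-blocking pattern described here is commonly built from an in-memory queue and a background worker. The sketch below shows that general pattern only: it is not Kakunin's client library, the `send` callable stands in for whatever actually posts to the event endpoint, and dropping events when the queue is full is an assumed policy.

```python
import queue
import threading
from typing import Any, Callable


class NonBlockingEmitter:
    """Queues events and delivers them from a background thread."""

    def __init__(
        self, send: Callable[[dict[str, Any]], None], max_pending: int = 10_000
    ) -> None:
        self._send = send  # hypothetical transport, injected by the caller
        self._queue: queue.Queue = queue.Queue(maxsize=max_pending)
        self.dropped = 0
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def emit(self, event: dict[str, Any]) -> bool:
        """Enqueue without waiting; returns False if the queue is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1  # assumed policy: count and drop
            return False

    def _drain(self) -> None:
        while True:
            event = self._queue.get()
            try:
                self._send(event)
            except Exception:
                pass  # a production emitter would retry or log here
            finally:
                self._queue.task_done()

    def flush(self) -> None:
        """Block until every queued event has been handed to `send`."""
        self._queue.join()


delivered: list[dict[str, Any]] = []
emitter = NonBlockingEmitter(send=delivered.append)
emitter.emit({"agent_id": "trading-agent-01", "event_type": "api_call"})
emitter.flush()
print(len(delivered))
```

The agent's own code path only pays for the enqueue; network latency and retries stay on the worker thread. Dropped events are counted because silent gaps in the event stream would undermine the audit trail.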

Once events are flowing, the risk scoring engine begins building the agent's behavioural baseline. For the first few days, the risk scores may fluctuate more than they will once the baseline stabilises. Plan for a 7–10 day warm-up period before you act on the pre-revocation warning threshold.

The ENISA threat landscape for AI systems provides an authoritative reference for the categories of threats that behavioural monitoring is designed to detect. It is worth reviewing this document when designing your event taxonomy and calibrating your risk thresholds.

To understand why these behavioural signals matter, it helps to place them in the broader threat landscape for 2026, where misuse often begins with small deviations before becoming material incidents.

From Monitoring to Accountability

Behavioural monitoring is not just a technical control. It is a compliance posture. When a regulator asks how you ensure your AI agents are operating within authorised parameters, the answer "we have real-time behavioural monitoring with automatic revocation when anomalies exceed defined thresholds" is meaningfully different from "we review our logs periodically."

The former is a managed process with defined thresholds, automated responses, and an audit trail. The latter is a manual process with undefined review cadence and no guarantee of timely detection.

As AI agents take on increasingly significant operational roles in regulated industries, the distinction between these two postures will become a key differentiator in regulatory audits and enterprise procurement processes. The organisations that build monitoring infrastructure now are building the evidence base they will need to demonstrate responsible deployment.

---

Kakunin's behavioural monitoring layer streams agent events in real time and updates risk scores continuously across a 30-day rolling window. Auto-revocation triggers within 60 seconds when risk exceeds 0.85. See pricing and plans or read how DevOps teams use Kakunin.