KAKUNIN

Autonomous AI Security Guide

Autonomous agents operate with real authority: executing trades, calling APIs, modifying infrastructure, processing payments. Traditional application security was designed for deterministic software controlled by humans. Agents are different — they make decisions at runtime, spawn sub-agents, use external tools, and act without synchronous human approval.

This guide covers the threat model, defensive architecture, and operational controls required to deploy AI agents securely in regulated environments.


Threat Model

What Changes When Software Becomes Autonomous

Traditional application security assumes the application logic is fixed at deploy time. Autonomous agents introduce three new threat vectors:

1. Prompt Injection
An adversarial payload embedded in external data (a document, a search result, a user message) causes the agent to execute unintended actions. Unlike SQL injection, the exploit surface is natural language rather than syntax.

2. Tool Scope Escalation
An agent with access to read_file and execute_bash can be coerced into combining them in ways the operator never intended. Scope policies at the certificate layer constrain what the agent may do, independent of what the LLM decides.

3. Identity Spoofing
Without cryptographic identity, there's no way to verify that the agent performing a transaction is the same agent that was authorised. An attacker who gains container access can impersonate the agent unless identity is bound to a hardware-backed key in KMS/HSM.

Attack Surface Matrix

| Vector | Exploited By | Mitigation |
| --- | --- | --- |
| Prompt injection | Adversarial content in tool outputs | Sandboxed tool execution; output validation |
| Stolen API key | Network interception; env var leak | Replace API keys with X.509 certificates |
| Container escape | Runtime vulnerability | KMS-backed keys; key never in container memory |
| Rogue sub-agent | LLM-orchestrated agent spawning | Sub-agent certificate scope = subset of parent |
| Baseline drift | Gradual objective deviation | Continuous behavioral profiling; rolling baseline |
| Replay attack | Captured signed request | Signed nonces; short-lived certificate validity |
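The "sub-agent certificate scope = subset of parent" rule can be checked mechanically before a sub-agent certificate is requested. The sketch below is illustrative only: the Scope field names follow the certificate example later in this guide, while isSubset and isScopeWithinParent are hypothetical helpers, not part of the Kakunin SDK.

```typescript
// Field names mirror the scope object used when requesting a certificate.
interface Scope {
  maxTransactionSize: number;
  allowedCounterparties: string[];
  allowedActions: string[];
  allowedRegions: string[];
}

// True when every item in `child` also appears in `parent`.
function isSubset(child: string[], parent: string[]): boolean {
  const allowed = new Set(parent);
  return child.every((item) => allowed.has(item));
}

// Hypothetical helper: a sub-agent scope is acceptable only if it grants
// nothing that the parent scope does not already grant.
function isScopeWithinParent(child: Scope, parent: Scope): boolean {
  return (
    child.maxTransactionSize <= parent.maxTransactionSize &&
    isSubset(child.allowedCounterparties, parent.allowedCounterparties) &&
    isSubset(child.allowedActions, parent.allowedActions) &&
    isSubset(child.allowedRegions, parent.allowedRegions)
  );
}
```

An orchestrator could call isScopeWithinParent before spawning and refuse to request the certificate when it returns false, so a rogue sub-agent never holds more authority than the agent that created it.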

Cryptographic Identity

Why API Keys Are Not Enough

API keys are:

X.509 certificates issued by Kakunin solve all of these:

// Register agent — one certificate per agent instance
const agent = await kakunin.agents.create({
  name: 'payment-processor-eu-v3',
  metadata: {
    version: '3.1.0',
    deployment: 'k8s-eu-west-1',
    instance: process.env.POD_NAME,  // per-pod identity
  },
});

const cert = await kakunin.agents.getCertificate(agent.id, {
  validityDays: 365,
  scope: {
    // Scope limits enforce authority at the cryptographic layer
    maxTransactionSize: 10000,        // EUR
    allowedCounterparties: ['stripe', 'revolut'],
    allowedActions: ['charge', 'refund'],
    allowedRegions: ['eu-west-1', 'eu-central-1'],
  },
});

// cert.pem — public certificate (safe to share, embed in requests)
// cert.kmsKeyArn — private key stays in KMS, never leaves

Certificate Trust Chain

Kakunin Root CA (AWS KMS RSA_4096, eu-west-1)
  └── Kakunin Intermediate CA (per-tenant)
        └── Agent Certificate (per-agent-instance)
              ├── Subject: CN=payment-processor-eu-v3
              ├── SAN: agent-id=a_xyz123
              ├── Scope extensions (custom X.509 extensions)
              └── Validity: 365 days

Signing Agent Actions

Every significant action should be signed with the agent's private key before submission:

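import { createVerify, randomUUID } from 'crypto';
import { KMS } from '@aws-sdk/client-kms';

const kms = new KMS({});

// Assumed to be in scope: agentCert (the agent's PEM certificate),
// computeFingerprint, the kakunin client, and redis (an ioredis client).

interface SignedAction {
  payload: { amount: number; [key: string]: unknown };
  signature: string;
  certificateFingerprint: string;
  timestamp: number;
  nonce: string;
}

// Bytes that get signed. Covering timestamp and nonce means neither can be
// swapped on a captured request without invalidating the signature.
function signingInput(payload: object, timestamp: number, nonce: string): Buffer {
  return Buffer.from(JSON.stringify({ payload, timestamp, nonce }));
}

async function signAction(payload: object, kmsKeyArn: string) {
  const timestamp = Date.now();
  const nonce = randomUUID();  // prevents replay

  // KMS signs; the private key never leaves the HSM
  const signature = await kms.sign({
    KeyId: kmsKeyArn,
    Message: signingInput(payload, timestamp, nonce),
    MessageType: 'RAW',
    SigningAlgorithm: 'RSASSA_PKCS1_V1_5_SHA_256',
  });

  return {
    payload,
    signature: Buffer.from(signature.Signature!).toString('base64'),
    certificateFingerprint: computeFingerprint(agentCert),
    timestamp,
    nonce,
  };
}

Downstream systems verify the signature before processing:

const NONCE_TTL_SECONDS = 3600;

async function verifySignedAction(signedAction: SignedAction) {
  // 1. Verify certificate is valid and not revoked
  const cert = await kakunin.certificates.verify(signedAction.certificateFingerprint);
  if (!cert.valid) throw new Error(`Certificate invalid: ${cert.reason}`);

  // 2. Verify signature over the same bytes that were signed
  const verifier = createVerify('RSA-SHA256');
  verifier.update(signingInput(signedAction.payload, signedAction.timestamp, signedAction.nonce));
  const valid = verifier.verify(cert.publicKey, signedAction.signature, 'base64');
  if (!valid) throw new Error('Signature verification failed');

  // 3. Reject stale requests, then check the nonce (prevent replay).
  //    Without the age check, a request could be replayed once its nonce expires.
  if (Math.abs(Date.now() - signedAction.timestamp) > NONCE_TTL_SECONDS * 1000) {
    throw new Error('Request timestamp outside the accepted window');
  }
  // SET ... NX returns 'OK' only when the key did not exist yet
  const stored = await redis.set(`nonce:${signedAction.nonce}`, '1', 'EX', NONCE_TTL_SECONDS, 'NX');
  if (stored !== 'OK') throw new Error('Replay detected: nonce already used');

  // 4. Enforce scope
  const action = signedAction.payload;
  if (action.amount > cert.scope.maxTransactionSize) {
    throw new Error(`Action exceeds certificate scope: ${action.amount} > ${cert.scope.maxTransactionSize}`);
  }

  return { verified: true, agentId: cert.agentId };
}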
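The signing code calls computeFingerprint without defining it. One plausible implementation is sketched below, assuming the fingerprint is the SHA-256 digest of the certificate's DER encoding written as sha256:<hex>. That format is an assumption based on the fingerprint annotation shown in the Kubernetes example; confirm what kakunin.certificates.verify actually expects before relying on it.

```typescript
import { createHash } from 'crypto';

// Hypothetical helper: SHA-256 over the DER bytes of a PEM certificate,
// formatted as "sha256:<hex>".
function computeFingerprint(pem: string): string {
  // Strip the PEM armor and whitespace to leave only the base64 body.
  const base64Body = pem
    .replace(/-----BEGIN CERTIFICATE-----/, '')
    .replace(/-----END CERTIFICATE-----/, '')
    .replace(/\s+/g, '');
  const der = Buffer.from(base64Body, 'base64');
  return 'sha256:' + createHash('sha256').update(der).digest('hex');
}
```

Hashing the DER bytes rather than the PEM text keeps the fingerprint stable when the same certificate is re-wrapped with different line breaks.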

Runtime Enforcement

Scope Policy Architecture

Scope policies are embedded in the certificate as custom X.509 extensions. They cannot be modified without reissuing the certificate (requires Kakunin CA private key in KMS). Three enforcement layers:

Layer 1 — Gateway (API Layer)
Middleware reads certificate scope before routing requests. Requests outside scope are rejected with 403 before reaching business logic.
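A minimal sketch of the gateway decision, assuming the scope has already been parsed out of the certificate extensions into a plain object. The ScopedRequest shape, the checkScope name, and the reason strings are hypothetical; only the scope field names come from the certificate example earlier in this guide.

```typescript
interface Scope {
  maxTransactionSize: number;
  allowedCounterparties: string[];
  allowedActions: string[];
  allowedRegions: string[];
}

// Hypothetical shape of an incoming request after authentication.
interface ScopedRequest {
  action: string;
  counterparty: string;
  region: string;
  amount: number;
}

type Decision = { status: 200 } | { status: 403; reason: string };

// Returns 403 with a reason for the first scope rule the request violates.
function checkScope(scope: Scope, req: ScopedRequest): Decision {
  if (!scope.allowedActions.includes(req.action)) {
    return { status: 403, reason: `action "${req.action}" not in scope` };
  }
  if (!scope.allowedCounterparties.includes(req.counterparty)) {
    return { status: 403, reason: `counterparty "${req.counterparty}" not in scope` };
  }
  if (!scope.allowedRegions.includes(req.region)) {
    return { status: 403, reason: `region "${req.region}" not in scope` };
  }
  if (req.amount > scope.maxTransactionSize) {
    return { status: 403, reason: `amount ${req.amount} exceeds ${scope.maxTransactionSize}` };
  }
  return { status: 200 };
}
```

Keeping the check a pure function makes it easy to call from any HTTP middleware and to unit test without a running gateway.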

Layer 2 — Tool Guard (LLM Tool Layer)
Each tool call passes through verify_agent_scope before execution:

import { ToolGuard } from '@kakunin/sdk';

const guard = new ToolGuard({
  apiKey: process.env.KAKUNIN_API_KEY,
  agentId: process.env.AGENT_ID,
  certificatePath: '/var/certs/cert.pem',
  kmsKeyArn: process.env.KMS_KEY_ARN,
});

// Wrap every tool
const tools = {
  charge: guard.wrap('charge', async (params) => {
    // Guard checks: cert valid? amount within scope? counterparty allowed?
    // Throws if any check fails — LLM gets tool error, not unguarded execution
    return await stripe.charges.create(params);
  }),

  refund: guard.wrap('refund', async (params) => {
    return await stripe.refunds.create(params);
  }),
};

Layer 3 — Behavioral Anomaly (Monitoring Layer)
Even scope-compliant actions are checked against the behavioral baseline. For example, 100 individually valid €9,900 charges in 10 minutes comply with the scope, but the resulting 12× deviation from the hourly baseline triggers a pre-revocation warning.
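The 12× figure can be reproduced with a simple rate calculation. The sketch below is an illustration, not Kakunin's scoring algorithm; the function name and the baseline of 50 charges per hour are assumptions chosen so the arithmetic matches the example.

```typescript
// Counts events in the trailing window, scales the count to an hourly rate,
// and returns that rate as a multiple of the baseline hourly rate.
function hourlyRateDeviation(
  eventTimesMs: number[],
  nowMs: number,
  windowMs: number,
  baselinePerHour: number,
): number {
  const windowStart = nowMs - windowMs;
  const recent = eventTimesMs.filter((t) => t > windowStart && t <= nowMs).length;
  const observedPerHour = recent * (3600000 / windowMs);
  return observedPerHour / baselinePerHour;
}

// 100 charges spaced 6 seconds apart, all inside the last 10 minutes.
const nowMs = Date.now();
const burst = Array.from({ length: 100 }, (_, i) => nowMs - i * 6000);

// 100 charges in 10 minutes is 600 per hour; against an assumed baseline
// of 50 per hour that is a 12x deviation.
console.log(hourlyRateDeviation(burst, nowMs, 10 * 60 * 1000, 50)); // 12
```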

LangChain Integration

import os

import stripe
from kakunin import ToolGuard
from langchain.tools import tool

guard = ToolGuard(
    api_key=os.environ["KAKUNIN_API_KEY"],
    agent_id=os.environ["AGENT_ID"],
)

@tool
@guard.verify_scope("charge")
def process_payment(amount: float, currency: str, customer_id: str) -> str:
    """Process a customer payment."""
    # Guard verifies: cert valid, amount within scope, anomaly score < threshold
    # Stripe expects the smallest currency unit; this assumes a two-decimal currency
    result = stripe.Charge.create(
        amount=int(round(amount * 100)),
        currency=currency,
        customer=customer_id,
    )
    return f"Charged {amount} {currency} to {customer_id}: {result.id}"

Behavioral Security

Establishing Baseline

Run the agent in observation mode for 7–14 days before enforcing behavioral limits:

// Week 1: Permissive mode — observe, don't block
const agent = await kakunin.agents.create({
  name: 'payment-processor-eu-v3',
  mode: 'observe',        // Log anomalies, don't block
  anomalyThreshold: 1.0,  // Never block during baseline collection
});

// Week 2: Review collected baseline stats
const stats = await kakunin.monitoring.getStats(agent.id, {
  window: '7d',
  metrics: ['transaction_size', 'frequency', 'counterparty_distribution', 'hour_of_day'],
});

// Approve and lock baseline
await kakunin.monitoring.setBaseline(agent.id, {
  transaction_size: { p50: stats.transaction_size.p50, p99: stats.transaction_size.p99 },
  transactions_per_hour: { p95: stats.frequency.p95 },
  preferred_counterparties: stats.counterparty_distribution.top(5),
  active_hours: stats.hour_of_day.activePeriods,
});
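The baseline is expressed in percentiles (p50, p95, p99). For readers unfamiliar with how such values are derived from raw observations, here is a minimal nearest-rank sketch. It is an assumption for illustration only; the guide does not say which percentile method kakunin.monitoring.getStats uses.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error('percentile of an empty sample');
  if (p <= 0 || p > 100) throw new RangeError('p must be in (0, 100]');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1] as number;
}

// Hypothetical transaction sizes observed during the baseline window (EUR).
const observedSizes = [120, 80, 9500, 300, 250, 410, 95, 1800, 60, 700];
console.log(percentile(observedSizes, 50)); // 250
console.log(percentile(observedSizes, 99)); // 9500
```

With only ten samples the p99 is simply the maximum, which is one reason to collect a full 7 to 14 days of traffic before locking the baseline.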

Anomaly Score Thresholds

| Score | Band | Action |
| --- | --- | --- |
| < 0.3 | Low | Allow; log normally |
| 0.3 – 0.74 | Medium | Allow; increase log verbosity |
| 0.75 – 0.84 | High | Issue pre-revocation warning; page on-call |
| >= 0.85 | Critical | Auto-revoke certificate; halt all agent actions |

These thresholds are configurable per agent. Adjust for risk tolerance and false-positive rate of the specific agent's task profile.
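The default bands translate directly into a lookup. The sketch below hard-codes the default thresholds from the table; the type and function names are hypothetical, and a real deployment would read per-agent thresholds from configuration instead.

```typescript
type Band = 'Low' | 'Medium' | 'High' | 'Critical';

interface BandDecision {
  band: Band;
  action: string;
}

// Maps an anomaly score to its band, checking the highest threshold first
// so that a score of 0.85 or more is never classified as merely High.
function classifyAnomalyScore(score: number): BandDecision {
  if (score >= 0.85) {
    return { band: 'Critical', action: 'Auto-revoke certificate; halt all agent actions' };
  }
  if (score >= 0.75) {
    return { band: 'High', action: 'Issue pre-revocation warning; page on-call' };
  }
  if (score >= 0.3) {
    return { band: 'Medium', action: 'Allow; increase log verbosity' };
  }
  return { band: 'Low', action: 'Allow; log normally' };
}
```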

Detecting Specific Attack Patterns

Prompt injection leading to data exfiltration:

Compromised container (stolen cert used from outside):

Gradual objective drift:


Incident Response

Certificate Revocation

Kakunin issues a pre-revocation warning when an anomaly score reaches 0.75 and revokes the certificate automatically at 0.85. Both events arrive by webhook:

// Webhook received when anomaly threshold breached
app.post('/webhook/kakunin', async (req, res) => {
  const event = req.body;

  if (event.type === 'agent.pre_revocation_warning') {
    const { agent_id, risk_score, anomaly_details } = event.data;

    // 1. Page on-call
    await pagerduty.createIncident({
      title: `Agent ${agent_id} anomaly score ${risk_score}`,
      severity: 'high',
      details: anomaly_details,
    });

    // 2. Optional: suspend non-critical tasks while investigating
    await agentOrchestrator.pause(agent_id, { reason: 'anomaly_investigation' });

    res.json({ acknowledged: true });
  }

  if (event.type === 'agent.certificate_revoked') {
    const { agent_id, revocation_reason } = event.data;

    // 1. Hard stop — refuse all new tasks
    await agentOrchestrator.terminate(agent_id);

    // 2. Quarantine transactions from the anomaly window
    await compliance.flagForReview({
      agent_id,
      from: event.data.anomaly_start,
      to: event.data.revoked_at,
    });

    // 3. Spin up replacement agent (new certificate, fresh identity)
    await agentOrchestrator.spawn({
      template: agent_id,
      reason: 'post_revocation_replacement',
    });

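    res.json({ acknowledged: true });
  }

  // Acknowledge unhandled event types so the request never hangs
  if (!res.headersSent) res.json({ acknowledged: false });
});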

Forensic Audit Trail

After an incident, the audit log provides complete reconstruction:

// Pull all actions for the affected agent during the incident window
// supabase-js resolves to { data, error } rather than throwing
const { data: auditTrail, error } = await supabase
  .from('audit_log')
  .select('*')
  .eq('tenant_id', tenantId)
  .eq('actor_id', agentId)
  .gte('created_at', incidentStart.toISOString())
  .lte('created_at', incidentEnd.toISOString())
  .order('created_at', { ascending: true });
if (error) throw error;

// Each row contains:
// - Signed action payload (proves what the agent intended)
// - Signature (proves it was this agent's certificate)
// - Risk score at time of action
// - Tool call parameters
// - Outcome

Every row is WORM (write once, read many): no update, no delete. This satisfies the EU AI Act Article 12 record-keeping and MiCA Article 71 audit trail requirements.


Infrastructure Hardening

Kubernetes Deployment Pattern

apiVersion: v1
kind: Pod
metadata:
  name: payment-agent-pod
  labels:
    app: payment-agent  # matched by the NetworkPolicy podSelector
  annotations:
    kakunin.ai/agent-id: "a_xyz123"
    kakunin.ai/certificate-fingerprint: "sha256:abc..."
spec:
  serviceAccountName: payment-agent-sa  # Minimal RBAC
  securityContext:
    runAsNonRoot: true
    runAsUser: 65534
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: agent
    image: myorg/payment-agent:v3.1.0
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: [ALL]
      readOnlyRootFilesystem: true
    env:
    - name: AGENT_ID
      value: "a_xyz123"
    - name: KMS_KEY_ARN
      valueFrom:
        secretKeyRef:
          name: kakunin-creds
          key: kms-key-arn
    # KAKUNIN_API_KEY from Doppler / external-secrets
    volumeMounts:
    - name: cert-volume
      mountPath: /var/certs
      readOnly: true
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: cert-volume
    projected:
      sources:
      - secret:
          name: agent-certificate
  - name: tmp
    emptyDir: {}

Network Policy

Restrict egress to only required endpoints:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payment-agent-egress
spec:
  podSelector:
    matchLabels:
      app: payment-agent
  policyTypes: [Egress]
  egress:
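  - to:
    - ipBlock:
        cidr: 0.0.0.0/0  # Any HTTPS host; narrow to the Stripe and Kakunin API ranges where available
        except:
        - 169.254.169.254/32  # cloud instance metadata endpoint
    ports:
    - protocol: TCP
      port: 443
  # All other egress is blocked, including DNS on port 53. Add an explicit rule
  # for the cluster DNS service if the agent must resolve hostnames itself.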

Compliance Mapping

| Control | Kakunin Feature | Regulation |
| --- | --- | --- |
| Agent identity documentation | X.509 certificate with serial number | EU AI Act Art. 11, MiCA Art. 70 |
| Authority limits enforced | Scope policy in certificate | EU AI Act Art. 26, MiCA Art. 67 |
| Continuous monitoring | Behavioral baseline + anomaly detection | EU AI Act Art. 9, MiCA Art. 72 |
| Automatic halt | Auto-revocation at score ≥ 0.85 | EU AI Act Art. 14 (human oversight) |
| Immutable audit trail | WORM audit_log (no UPDATE/DELETE) | EU AI Act Art. 12, MiCA Art. 71 |
| Incident reporting | Webhook + pre-revocation warnings | EU AI Act Art. 73, MiCA Art. 67 |

What's Next?