Policy as Code + Agents: Guardrails That Actually Hold

Our platform agent opened a PR that would have exposed a database security group to 0.0.0.0/0. And here's the thing — it was doing exactly what we asked. The prompt said "make this service accessible from the VPN." The agent heard "open ingress from anywhere." Technically? Both solve the same problem. The agent wasn't wrong. It just didn't know why the constraint existed.

Without a policy gate, that change would have gone through. Nobody would have caught it until someone ran a security audit, or worse. With one, it didn't — the agent got the policy failure back as feedback and tried a more targeted fix instead.

That's what policy as code looks like in an agentic system. Not blocking humans from making mistakes — they've got code review for that — but catching what agents get wrong before a human ever sees the PR. For the change management tier model that determines which agent changes need policy checks in the first place, see Agentic Change Management.

[Figure: C4 architecture diagram]


Quick takeaways

  • Policy validation should run before the PR review — not after, when someone's already given it their time
  • Agents that get structured policy feedback can self-correct before they ever open a PR
  • Kyverno handles Kubernetes resources; OPA handles everything else
  • The goal isn't to slow agents down — it's to make what they produce actually trustworthy

Self-correction changes the economics

Agents that receive structured policy failures and self-correct before opening a PR save reviewer time on every loop. In practice, 60–70% of simple violations (missing labels, wrong image registry, absent resource limits) resolve on the second attempt without any human involvement.


Where agents hit policy limits

Agents don't make infrastructure mistakes because they're reckless. They make them because they're optimising for "make it work" — and they genuinely don't know why the constraint exists. Here's what that looks like in practice:

  • Network exposure: security groups or ingress rules opened too broadly because "accessible" got interpreted as "from anywhere"
  • Resource limits absent: workloads deployed without CPU/memory limits because the task was to get it running, not to budget it
  • Privileged containers: securityContext.privileged: true because something wouldn't start otherwise and that fixed it
  • Missing labels: monitoring and cost allocation broken because nobody told the agent which labels were required
  • Secret handling: credentials hardcoded into environment variables because that's the path of least resistance

None of this is malicious. It's entirely predictable. An agent with no policy signal will always choose the path that satisfies the prompt — and your security constraints aren't in the prompt.
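To make those failure modes concrete, here is a minimal sketch of what checking for four of them looks like once the manifest has been parsed into a dictionary. It is not how Kyverno works internally, and the names are assumptions for illustration: `REQUIRED_LABELS`, `SECRET_HINTS` and `find_violations` are hypothetical, and the secret check is a naive name-based heuristic. Network exposure is left out because it lives on different resource types.

```python
from typing import Any

REQUIRED_LABELS = {"app", "team"}  # hypothetical; substitute your own required set
SECRET_HINTS = ("PASSWORD", "SECRET", "TOKEN", "API_KEY")  # naive name heuristic


def find_violations(deployment: dict[str, Any]) -> list[str]:
    """Return human-readable violations for a parsed Deployment manifest."""
    violations: list[str] = []

    labels = deployment.get("metadata", {}).get("labels") or {}
    missing = sorted(REQUIRED_LABELS - set(labels))
    if missing:
        violations.append(f"missing labels: {', '.join(missing)}")

    pod_spec = deployment.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")

        limits = container.get("resources", {}).get("limits") or {}
        if "cpu" not in limits or "memory" not in limits:
            violations.append(f"{name}: no CPU/memory limits")

        if container.get("securityContext", {}).get("privileged") is True:
            violations.append(f"{name}: privileged container")

        for env in container.get("env", []):
            # A literal `value` on a secret-looking name suggests a hardcoded
            # credential; `valueFrom.secretKeyRef` is the usual alternative.
            looks_secret = any(h in env.get("name", "").upper() for h in SECRET_HINTS)
            if "value" in env and looks_secret:
                violations.append(f"{name}: env {env['name']} looks hardcoded")

    return violations
```

The point of the sketch is the shape of the output: a list of specific, named violations is something an agent can act on, whereas a bare "rejected" is not.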

The silent compliance gap

An agent without a policy gate isn't ignoring your security rules — it doesn't know they exist. The gap is invisible until a vulnerability scan or incident surfaces it. By then, multiple agent PRs may have landed with the same flaw.


1) Kyverno for Kubernetes changes

Kyverno runs as an admission controller. Any Kubernetes manifest an agent tries to apply, directly or via ArgoCD sync, gets validated before it lands. Nothing that matches a policy set to Enforce slips through at apply time.

But if your agent is proposing changes as PRs rather than applying them directly, you want to catch violations even earlier, by running the same policies through the Kyverno CLI in CI (section 2 shows the workflow). These are the two policies used throughout this walkthrough:

# kyverno/policies/require-resource-limits.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
  annotations:
    policies.kyverno.io/description: >
      All containers must have CPU and memory limits defined.
      This prevents a single workload from consuming cluster resources.
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-container-limits
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "All containers must define resource limits."
        pattern:
          spec:
            containers:
              - resources:
                  limits:
                    cpu: "?*"
                    memory: "?*"

# kyverno/policies/restrict-network-exposure.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-ingress-cidr
spec:
  validationFailureAction: Enforce
  rules:
    - name: no-public-ingress
      match:
        any:
          - resources:
              kinds: [NetworkPolicy]
      validate:
        message: "Ingress from 0.0.0.0/0 is not permitted. Use a specific CIDR range."
        deny:
          conditions:
            any:
              - key: "{{ request.object.spec.ingress[].from[].ipBlock.cidr }}"
                operator: AnyIn
                value: ["0.0.0.0/0", "::/0"]
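If the deny condition in the second policy is hard to read, this plain-Python approximation may help. It mimics what the `spec.ingress[].from[].ipBlock.cidr` projection collects and what `AnyIn` then tests. It is a sketch for intuition only, not Kyverno's implementation, and `ingress_cidrs` and `is_denied` are hypothetical names.

```python
BLOCKED_CIDRS = {"0.0.0.0/0", "::/0"}


def ingress_cidrs(network_policy: dict) -> list[str]:
    """Collect every ipBlock.cidr found under spec.ingress[].from[]."""
    cidrs: list[str] = []
    for rule in network_policy.get("spec", {}).get("ingress") or []:
        for peer in rule.get("from") or []:
            cidr = (peer.get("ipBlock") or {}).get("cidr")
            if cidr is not None:
                cidrs.append(cidr)
    return cidrs


def is_denied(network_policy: dict) -> bool:
    """AnyIn semantics: deny if any collected CIDR is in the blocked set."""
    return any(cidr in BLOCKED_CIDRS for cidr in ingress_cidrs(network_policy))
```

One thing the approximation makes visible: an ingress rule with no `from` field at all produces an empty list, so it is not denied, even though Kubernetes treats a missing `from` as "all sources". If that case matters to you, it likely needs its own rule.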

Apply this: two-layer enforcement

Run Kyverno in both CI (as a CLI check against manifests in the PR) and as a live admission controller in the cluster. CI catches violations before merge. The admission controller catches anything that arrives through a non-PR path — a direct kubectl apply, a Helm release, a GitOps sync from a misconfigured tool.


2) Running Kyverno in CI against agent PRs

# .github/workflows/policy-gate.yml
name: Policy Gate

on:
  pull_request:
    paths: ['k8s/**', 'charts/**', 'manifests/**']

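jobs:
  kyverno-check:
    runs-on: ubuntu-latest
    permissions:
      contents: read          # needed by actions/checkout
      pull-requests: write    # needed by the PR comment step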
    steps:
      - uses: actions/checkout@v4

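      - name: Install Kyverno CLI
        env:
          # Pin a release. v1.12.0 is only an example; match the version your cluster runs.
          KYVERNO_VERSION: v1.12.0
        run: |
          # Release assets embed the version in the file name
          curl -LO "https://github.com/kyverno/kyverno/releases/download/${KYVERNO_VERSION}/kyverno-cli_${KYVERNO_VERSION}_linux_x86_64.tar.gz"
          tar -xzf "kyverno-cli_${KYVERNO_VERSION}_linux_x86_64.tar.gz"
          sudo mv kyverno /usr/local/bin/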

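      - name: Validate manifests against policies
        run: |
          # --output-format applies to the policy report, so request one with --policy-report.
          # Depending on CLI version, banner lines may precede the report and need stripping.
          kyverno apply kyverno/policies/ \
            --resource k8s/ \
            --policy-report \
            --output-format json > policy-results.json

          # Fail if any policy violations found
          python scripts/check-policy-results.py policy-results.json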

      - name: Comment policy violations on PR
        if: failure()
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: python scripts/comment-policy-failures.py policy-results.json

This is the same CI gate that validates fix PRs opened by the GitOps drift detection agent — one workflow, catching violations from multiple sources.
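The workflow calls scripts/check-policy-results.py and scripts/comment-policy-failures.py without showing them. Here is a minimal sketch of the logic they could share. It assumes policy-results.json is a Kubernetes policy report with a top-level `results` list whose entries carry `policy`, `rule`, `result`, `message` and `resources`. That shape is an assumption, so check it against what your Kyverno CLI version actually emits. The function names are hypothetical.

```python
import json
from pathlib import Path


def collect_failures(report: dict) -> list[dict]:
    """Return the report entries whose outcome is `fail` or `error`."""
    return [r for r in report.get("results", []) if r.get("result") in ("fail", "error")]


def format_comment(failures: list[dict]) -> str:
    """Render failures as a short PR comment body, one bullet per violation."""
    lines = ["Policy gate found violations:", ""]
    for failure in failures:
        targets = ", ".join(
            f"{res.get('kind')}/{res.get('name')}" for res in failure.get("resources", [])
        )
        lines.append(
            f"- {failure.get('policy')} / {failure.get('rule')} "
            f"on {targets or 'unknown resource'}: {failure.get('message')}"
        )
    return "\n".join(lines)


def main(path: str) -> int:
    """Return 1 if the report at `path` contains failures, else 0."""
    failures = collect_failures(json.loads(Path(path).read_text()))
    if failures:
        print(format_comment(failures))
        return 1
    print("No policy violations found.")
    return 0
```

As a script, the last line would be something like `sys.exit(main(sys.argv[1]))`. Keeping the formatting separate from the exit code means the same text can go into the PR comment and into the agent's feedback loop.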

Make it a required status check

Set the policy gate as a required status check on protected branches. That means it can't be bypassed without admin access — and if someone with admin access is bypassing it, you'd rather know. Required checks also prevent the silent "I'll fix it after merge" drift.
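If you manage branch protection through the API rather than the settings page, the request looks roughly like this. It is a sketch against GitHub's "update branch protection" REST endpoint. The owner, repo and token values are placeholders, and `kyverno-check` is assumed to be the check name because that is the job ID in the workflow above. Note that this endpoint replaces the whole protection config, so the two `None` fields here would clear any existing review or push restrictions.

```python
import json
import urllib.request


def protection_payload(check_name: str) -> dict:
    """Branch protection body that makes `check_name` a required status check."""
    return {
        "required_status_checks": {"strict": True, "contexts": [check_name]},
        "enforce_admins": True,  # admins are held to the check too
        "required_pull_request_reviews": None,  # placeholder: clears review rules
        "restrictions": None,  # placeholder: no push restrictions
    }


def build_request(owner: str, repo: str, branch: str, token: str,
                  check_name: str = "kyverno-check") -> urllib.request.Request:
    """Build (but do not send) the PUT request for the protection endpoint."""
    url = f"https://api.github.com/repos/{owner}/{repo}/branches/{branch}/protection"
    return urllib.request.Request(
        url,
        data=json.dumps(protection_payload(check_name)).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
```

Sending it is a matter of passing the result to `urllib.request.urlopen` with a token that has admin rights on the repository. Merge your existing review settings into the payload first.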


3) OPA for Terraform changes

Terraform is where the really dangerous stuff lives — security groups, database configs, network rules. OPA with Conftest validates the plan before apply, so you catch it while it's still just JSON:

# policies/terraform/security-groups.rego
package terraform.security_groups

# rego.v1 syntax works on OPA 0.59+ and on OPA 1.x, where the older
# `deny[msg] { ... }` form is rejected by default.
import rego.v1

# Note: these rules only look at root-module resources. Resources declared in
# child modules live under planned_values.root_module.child_modules.
deny contains msg if {
  resource := input.planned_values.root_module.resources[_]
  resource.type == "aws_security_group_rule"
  resource.values.cidr_blocks[_] == "0.0.0.0/0"
  resource.values.type == "ingress"

  msg := sprintf(
    "Security group rule '%s' allows ingress from 0.0.0.0/0. Use a specific CIDR.",
    [resource.name]
  )
}

deny contains msg if {
  resource := input.planned_values.root_module.resources[_]
  resource.type == "aws_db_instance"
  resource.values.publicly_accessible == true

  msg := sprintf(
    "RDS instance '%s' is publicly accessible. Set publicly_accessible = false.",
    [resource.name]
  )
}
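
# Run in CI after terraform plan:
terraform plan -out=tfplan
terraform show -json tfplan > tfplan.json
# Conftest only evaluates the `main` package by default. The policy above is in
# terraform.security_groups, so tell it to evaluate all namespaces.
conftest test tfplan.json --policy policies/terraform/ --all-namespaces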

If you want to go deeper on how Kyverno governance works at scale across multiple clusters, the KubeCon session covers exactly that: Agentic Policy Management: Kyverno, MCP, and Closed-Loop Governance.

Apply this: Rego for everything non-Kubernetes

OPA's strength is that it evaluates any JSON structure. Terraform plans, Helm values, AWS CloudFormation templates, custom config files: anything that can be rendered as JSON can be checked with Rego and the same tooling. One policy language for the entire infrastructure surface.
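To see what "just JSON" means for a Terraform plan, here is the same pair of checks as the Rego file, written as a small Python walk over the output of terraform show -json. It illustrates the input shape and is not a replacement for OPA. The function names are hypothetical. Unlike the Rego rules, this sketch also descends into `child_modules`, which is where resources declared inside modules appear.

```python
from typing import Iterator


def iter_resources(module: dict) -> Iterator[dict]:
    """Yield resources from a plan module and, recursively, its child modules."""
    yield from module.get("resources", [])
    for child in module.get("child_modules", []):
        yield from iter_resources(child)


def find_plan_violations(plan: dict) -> list[str]:
    """Mirror the two deny rules: open ingress CIDRs and public RDS instances."""
    messages: list[str] = []
    root = plan.get("planned_values", {}).get("root_module", {})
    for resource in iter_resources(root):
        values = resource.get("values") or {}
        address = resource.get("address", "<unknown>")

        if (
            resource.get("type") == "aws_security_group_rule"
            and values.get("type") == "ingress"
            and "0.0.0.0/0" in (values.get("cidr_blocks") or [])
        ):
            messages.append(f"{address}: ingress from 0.0.0.0/0")

        if resource.get("type") == "aws_db_instance" and values.get("publicly_accessible") is True:
            messages.append(f"{address}: RDS instance is publicly accessible")

    return messages
```

Writing it out this way is also a handy way to prototype a rule against a real plan file before committing it to Rego.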


4) Feeding policy failures back to the agent

Here's where it gets genuinely useful. Not just blocking bad output, but closing the loop — the agent gets the violation, understands what's wrong, and tries again. No human in the middle:

# scripts/agent-with-policy-feedback.py
import anthropic
import subprocess
import json

client = anthropic.Anthropic()

def generate_and_validate_manifest(requirement: str, max_attempts: int = 3) -> str:
    messages = [{
        "role": "user",
        "content": f"Generate a Kubernetes deployment manifest for: {requirement}"
    }]

    for attempt in range(max_attempts):
        response = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=2048,
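            system="""You are a Kubernetes manifest generator. Generate valid, 
            production-ready manifests that include resource limits, security contexts, 
            and required labels. Do not use privileged containers. Output only raw YAML, 
            with no Markdown fences or commentary, because the output is piped to kyverno.""",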
            messages=messages
        )

        manifest = response.content[0].text

        # Validate against Kyverno policies; "--resource -" reads the manifest from stdin
        result = subprocess.run(
            ["kyverno", "apply", "kyverno/policies/", "--resource", "-"],
            input=manifest,
            capture_output=True,
            text=True
        )

        if result.returncode == 0:
            return manifest  # passes policy

        # Policy failed - feed the violation back
        messages.append({"role": "assistant", "content": manifest})
        messages.append({
            "role": "user",
            "content": f"""The manifest violates the following policies:

{result.stdout}

Please fix the manifest to comply with these policies and try again."""
        })

    raise ValueError(f"Could not generate compliant manifest after {max_attempts} attempts")

Cap the retry loop

Set a hard limit on self-correction attempts — three is usually enough. If an agent can't produce a compliant manifest in three tries, the violation is likely ambiguous or requires human context to resolve. Uncapped loops waste API budget and delay escalation on the cases that actually need human judgement.
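One way to make the cap and the escalation path explicit is to pull the loop out of the API-specific code. The sketch below assumes two callables you supply: `generate(feedback)` returns a candidate manifest, and `validate(output)` returns an empty string when the output passes. `Attempt`, `EscalationNeeded` and `self_correct` are hypothetical names, not part of any library.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    output: str
    violations: str


class EscalationNeeded(Exception):
    """Raised when the retry budget is spent; carries the full attempt history."""

    def __init__(self, attempts: list[Attempt]):
        super().__init__(f"no compliant output after {len(attempts)} attempts")
        self.attempts = attempts


def self_correct(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], str],
    max_attempts: int = 3,
) -> str:
    """Generate, validate, and retry with the violations as feedback, up to a cap."""
    attempts: list[Attempt] = []
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        output = generate(feedback)
        violations = validate(output)
        if not violations:
            return output
        attempts.append(Attempt(output, violations))
        feedback = violations
    raise EscalationNeeded(attempts)
```

Because the exception carries every attempt and its violations, whoever picks up the escalation can see what the agent tried instead of starting from scratch.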


Common mistakes

"We'll handle policy in the admission controller, not CI."
Admission controllers catch violations at apply time, which is after the PR has already been reviewed, approved, and merged. CI policy gates catch them before merge, before a human spends 20 minutes reviewing something that was always going to be rejected. These aren't alternatives; they're different gates for different moments.

"Our policies are too strict for agents."
Honestly, if your policies are blocking legitimate changes, that's a signal the policies need parameters — not that they need removing. A policy that says "no public databases" can absolutely allow exceptions with a specific annotation and a written justification. That's not loosening the policy; that's making it honest about when exceptions are acceptable.
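As a rough illustration of a parameterised policy, here is the "no public databases" rule with an explicit exception path. Everything here is hypothetical: the annotation keys, the minimum justification length, and the function names are stand-ins, and in practice you would express this in Kyverno or Rego rather than Python.

```python
EXEMPT_KEY = "policy.example.com/allow-public-database"  # hypothetical annotation
JUSTIFICATION_KEY = "policy.example.com/justification"   # hypothetical annotation
MIN_JUSTIFICATION_LENGTH = 20  # arbitrary floor so "ok" does not count


def exception_granted(annotations: dict[str, str]) -> bool:
    """An exception needs both the opt-in flag and a written justification."""
    if annotations.get(EXEMPT_KEY) != "true":
        return False
    justification = annotations.get(JUSTIFICATION_KEY, "").strip()
    return len(justification) >= MIN_JUSTIFICATION_LENGTH


def check_database(publicly_accessible: bool, annotations: dict[str, str]) -> list[str]:
    """Return violations for a database resource, honouring documented exceptions."""
    if not publicly_accessible or exception_granted(annotations):
        return []
    return [
        "Database is publicly accessible. Make it private, or set "
        f"{EXEMPT_KEY}=true together with a {JUSTIFICATION_KEY} annotation."
    ]
```

The failure message names the exception path, so an agent (or a human) hitting the rule learns how to request an exception instead of guessing at a workaround.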

"Agents will just disable the policy check."
Make the policy check a required status check on the branch. It can't be bypassed without admin access. If someone with admin access is disabling it, that's a different conversation — and one you'd rather know about.


Frequently asked questions

What is policy as code and why does it matter for AI agents?

Policy as code means your governance rules live as machine-readable files — version-controlled, tested, enforced automatically in CI and at admission time. For agents specifically, it matters because agents move fast. They can propose dozens of changes before a human reviews any of them. Without enforcement in the pipeline, you're just hoping the agent happens to know your security rules. That doesn't scale.

What's the difference between Kyverno and OPA?

Kyverno is Kubernetes-native. It runs as an admission controller, and you write your rules in YAML, which makes it approachable if you're already living in Kubernetes manifests. OPA is general-purpose: it uses Rego and can evaluate any JSON document, which makes it the right fit for Terraform plans and anything that isn't a Kubernetes resource. Use both: Kyverno for the cluster, OPA for everything else.

How do you enforce OPA policy against Terraform changes?

Convert your Terraform plan to JSON with terraform show -json, then pass it to conftest (a thin OPA wrapper) or the OPA HTTP API. Your Rego policy defines what's allowed — for example, blocking any security group rule that allows ingress from 0.0.0.0/0. Run it as a GitHub Actions step before terraform apply and the plan never gets applied if it violates policy.

Can an AI agent correct itself when it violates a policy?

Yes, and it works surprisingly well for clear violations. Feed the policy failure message back as a new prompt — "this violated these policies, please fix it" — and the agent usually produces a compliant version on the second or third attempt. For simple stuff like missing labels or wrong image registries, this is fast. For complex violations, escalate. Don't let the agent loop indefinitely trying to solve something that needs human judgement.

Do you need both CI policy gates and an admission controller?

Yes — they catch different failure modes at different times. CI gates catch violations before merge, saving reviewer time. Admission controllers catch anything that arrives through a non-PR path: direct kubectl applies, Helm releases, misconfigured GitOps tools. The combination means you have defence in depth rather than a single point of enforcement.


What you get

  • Agent-generated PRs that would have failed policy are caught before a human ever sees them — no wasted review time
  • Agents that self-correct are genuinely faster than agents that escalate every small violation to a human
  • Every policy violation in CI is a record: what the agent tried, what failed, why it was rejected — useful for debugging and for audits
  • Platform teams can add new policies without worrying about agent behaviour silently changing overnight

Walkthrough files

  • kyverno/policies/require-resource-limits.yaml — enforce CPU/memory limits on all Pods
  • kyverno/policies/restrict-network-exposure.yaml — block open CIDR ingress
  • policies/terraform/security-groups.rego — OPA rules for Terraform security group and RDS config
  • .github/workflows/policy-gate.yml — CI workflow validating manifests against Kyverno policies
  • scripts/agent-with-policy-feedback.py — agent loop with policy validation and self-correction

Not every agent change needs the same level of scrutiny. For the tier model that decides which changes need policy checks in the first place, see Agentic Change Management.