The Agent Infrastructure Conversation

Why your cloud architecture needs to change before your agents go to production

The Shift

From AI that answers → AI that acts

AI Assistants (Yesterday)

Answer questions
Summarize documents
Augment human decisions
Human remains in the loop

AI Agents (Now)

Take actions: write code, call APIs, trigger workflows
Operate autonomously across long time horizons
Orchestrate other agents
Human is out of the loop — by design

The infrastructure that worked for an AI assistant — a stateless API call — doesn't work for an agent that runs for hours, spawns subagents, and touches your production systems.

The Bottleneck

The problem isn't the model. It's the infrastructure around it.

Engineering leaders tell us the same things. The model works. The demo looks great. Then production hits three walls:

Security

Agents execute code and call APIs — who authorized that?
Non-human identities have no standard auth model
Prompt injection and tool misuse are unsolved at scale

Scale

Agents are long-running, stateful, concurrent
Traditional serverless is ephemeral — no memory, no persistence
10 developers running 5 agents each = 50 concurrent workflows, all day

Governance

Which agents can access which systems?
How do you audit what an agent did and why?
MCP and other tool-calling protocols ship without built-in enterprise controls such as per-agent permissions and audit logging

These aren't AI problems. They're infrastructure problems. And they need infrastructure answers.

The New Architecture

What agent-native infrastructure actually requires

Compute

Persistent execution environments

Isolated sandboxes per agent / task

Version-controlled code artifacts

Concurrent workflow orchestration at scale

Security — Zero Trust for Agents

Identity for non-human workloads

Private networking between agents and services

Policy enforcement at every tool call

Auditable execution trails

Tooling

Persistent memory across agent sessions

Multi-model access (avoid vendor lock-in)

Voice / email / communication primitives

SDK that abstracts agent complexity

The architecture diagram looks simple. The integration is where teams get stuck. You need these three layers to talk to each other from day one — not stitched together later.

Cloudflare's Approach

Cloud 2.0 — Infrastructure designed for the agentic era

We call this Cloud 2.0. Not because the old cloud is going away — but because the old cloud wasn't designed for agents.

Edge-native by design

Agents run closest to where work happens — low latency, global distribution, no cold-start penalty for persistent workloads.

Zero-trust by default

Every agent action is authenticated and policy-governed. Non-human identities are first-class. Security isn't a layer you add — it's the substrate.

Integrated, not assembled

Compute, security, and tooling from one platform. No glue code. No three-vendor blame-shifting when something breaks.

Key Capabilities

Four pillars — and what they mean for your teams

Persistent, Isolated Compute

Agents run in sandboxed Linux environments with versioned code storage. The platform can run 50,000 concurrent workflows.

Your agents can run production workloads — not just demos. Scale is not a ceiling you'll hit on day one.

Zero Trust for Non-Human Identities

Machine-to-machine auth built on open standards, including RFC 9728 (OAuth 2.0 Protected Resource Metadata). Private networking connects agents to internal services without public exposure.

Your security team can approve agents for production. No more "we don't have an auth model for this."
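For teams that want to see what RFC 9728 looks like on the wire, here is a minimal sketch of its discovery step only: building the well-known metadata URL for a protected resource and checking that the returned document names that resource. The hostnames and the example document are hypothetical, and this does not describe how any particular platform implements Managed OAuth.

```python
from urllib.parse import urlsplit, urlunsplit

# Well-known path suffix registered by RFC 9728.
WELL_KNOWN = "/.well-known/oauth-protected-resource"


def metadata_url(resource: str) -> str:
    """Build the metadata URL for a protected resource identifier.

    The well-known segment is inserted between the host and any path.
    """
    parts = urlsplit(resource)
    # A bare trailing slash after the host is dropped before insertion.
    path = "" if parts.path == "/" else parts.path
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN + path, parts.query, ""))


def validate_metadata(resource: str, metadata: dict) -> list[str]:
    """Return the advertised authorization servers.

    Rejects documents whose "resource" value differs from the identifier
    the client asked about, which guards against impersonation.
    """
    if metadata.get("resource") != resource:
        raise ValueError("metadata 'resource' does not match the requested resource")
    return list(metadata.get("authorization_servers", []))


# Hypothetical resource and metadata document, for illustration only.
resource = "https://mcp.example.com/tools"
document = {
    "resource": resource,
    "authorization_servers": ["https://auth.example.com"],
}
servers = validate_metadata(resource, document)
```

An agent would then request a token from one of the listed authorization servers before calling the resource. That step is outside this sketch.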

Governed Tool Access (MCP)

Enterprise governance layer for Model Context Protocol. Define which agents can call which tools, with full audit logs.

When your compliance team asks "what did the agent do?" — you can answer.
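To make "which agents can call which tools, with full audit logs" concrete, here is a minimal sketch of a gateway that checks an allow-list before every tool call and records the decision either way. The class name, policy shape, agent IDs, and tool names are all hypothetical. A real governance layer would sit in front of an MCP server rather than wrap a Python function.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ToolGateway:
    # Hypothetical policy shape: agent id -> set of tool names it may call.
    allowed: dict[str, set[str]]
    audit_log: list[dict] = field(default_factory=list)

    def call(self, agent_id: str, tool: str, handler, **args):
        """Run handler(**args) only if the policy permits it; log every attempt."""
        permitted = tool in self.allowed.get(agent_id, set())
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": agent_id,
            "tool": tool,
            "args": args,
            "decision": "allow" if permitted else "deny",
        })
        if not permitted:
            raise PermissionError(f"{agent_id} may not call {tool}")
        return handler(**args)


# Illustrative usage with made-up agent and tool names.
gateway = ToolGateway(allowed={"triage-bot": {"tickets.read"}})
result = gateway.call(
    "triage-bot", "tickets.read", lambda ticket_id: {"id": ticket_id}, ticket_id=42
)

try:
    gateway.call("triage-bot", "tickets.delete", lambda ticket_id: None, ticket_id=42)
except PermissionError:
    pass  # the denied attempt is still in the audit log

decisions = [entry["decision"] for entry in gateway.audit_log]
```

Logging before the permission check is deliberate: the denied attempts are often the entries a compliance review cares about most.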

Unified AI Platform + Memory

Access to 14+ model providers through a single interface, plus persistent memory that survives across agent sessions.

You're not locked into one model vendor. And your agents retain context across sessions, so they build on earlier work instead of starting from scratch.
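One way to picture "persistent memory that survives across agent sessions" is a small key-value store that outlives the process that wrote it. The sketch below uses SQLite from the Python standard library as a stand-in. The class name, table layout, and agent name are assumptions for illustration, not the platform's memory API.

```python
import os
import sqlite3
import tempfile


class AgentMemory:
    """Key-value memory scoped per agent and stored on disk."""

    def __init__(self, path: str):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memory ("
            "agent TEXT, key TEXT, value TEXT, PRIMARY KEY (agent, key))"
        )

    def remember(self, agent: str, key: str, value: str) -> None:
        # The connection context manager commits on success.
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO memory VALUES (?, ?, ?)",
                (agent, key, value),
            )

    def recall(self, agent: str, key: str):
        row = self.db.execute(
            "SELECT value FROM memory WHERE agent = ? AND key = ?",
            (agent, key),
        ).fetchone()
        return row[0] if row else None

    def close(self) -> None:
        self.db.close()


# Two separate "sessions" sharing one database file (hypothetical agent name).
path = os.path.join(tempfile.mkdtemp(), "memory.db")

session_one = AgentMemory(path)
session_one.remember("docs-bot", "style_guide", "sentence case headings")
session_one.close()

session_two = AgentMemory(path)
recalled = session_two.recall("docs-bot", "style_guide")
session_two.close()
```

The second session opens a fresh connection and still finds what the first one stored, which is the property the slide describes.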

Enterprise Readiness

The questions your security and compliance teams will ask — answered

How do we control which agents can access which internal systems?

Private mesh networking keeps agent traffic off the public internet. Policy enforcement at the tool layer means access is granted per capability, not as a blanket grant per agent.

What happens when an agent is compromised or acts unexpectedly?

Isolated execution environments mean blast radius is contained by default. Every action is logged. You can audit, replay, and revoke.
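The "audit, replay, and revoke" loop can be sketched in a few lines, assuming a supervisor that records each action and refuses anything from a revoked agent. The class and agent names are hypothetical, and "replay" here only means reading back the recorded actions in order, not re-executing them.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSupervisor:
    revoked: set[str] = field(default_factory=set)
    # Ordered trail of (agent_id, action) pairs.
    trail: list[tuple[str, str]] = field(default_factory=list)

    def record(self, agent_id: str, action: str) -> bool:
        """Accept and log an action unless the agent has been revoked."""
        if agent_id in self.revoked:
            return False
        self.trail.append((agent_id, action))
        return True

    def revoke(self, agent_id: str) -> None:
        """Cut off an agent; later actions from it are refused."""
        self.revoked.add(agent_id)

    def replay(self, agent_id: str) -> list[str]:
        """Return this agent's accepted actions in the order they happened."""
        return [action for who, action in self.trail if who == agent_id]


# Illustrative incident: the agent is revoked after its first action.
supervisor = AgentSupervisor()
supervisor.record("review-bot", "opened pull request")
supervisor.revoke("review-bot")
accepted = supervisor.record("review-bot", "merged pull request")
history = supervisor.replay("review-bot")
```

In this sketch revocation takes effect on the very next action, and the trail still shows what the agent did before it was cut off.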

Our compliance team will never approve a third-party system calling our internal APIs.

Managed OAuth built on RFC 9728 gives agents verifiable, revocable identities, the same model your compliance team already uses for human SSO. Non-human identities are governed, not assumed.

This is what enterprise-grade agent infrastructure looks like

Getting Started

A 90-day pathway from exploration to production

Identify

Map existing AI usage and pain points
Pick one use case with clear success criteria
Engage security and compliance early

Outcome: clear scope, not "all agents"

Pilot

Stand up a sandboxed compute environment
Implement machine identity and audit logging from day one
Run the agent against a non-production system

Outcome: security team has signed off

Scale

Expand to production with governance controls in place
Connect persistent memory and multi-model access
Define runbooks for agent failures and escalations

Outcome: repeatable pattern for future agents

Most teams start with internal tooling — code review, documentation, ticket triage. High value, lower risk, and a natural on-ramp to the governance model you'll need for customer-facing agents.

Discussion

Three questions worth bringing back to your team

Where are your agents in the journey — demo, pilot, or heading toward production? What's the blocker?

Does your current cloud architecture have an answer for non-human identity and agent governance?

Which internal use case would deliver the most value in the next 90 days — and what would it take to get your security team to yes?