Why your cloud architecture needs to change before your agents go to production
The Shift
From AI that answers → AI that acts
AI Assistants (Yesterday)
Answer questions
Summarize documents
Augment human decisions
Human remains in the loop
AI Agents (Now)
Take actions: write code, call APIs, trigger workflows
Operate autonomously across long time horizons
Orchestrate other agents
Human is out of the loop — by design
The infrastructure that worked for an AI assistant — a stateless API call — doesn't work for an agent that runs for hours, spawns subagents, and touches your production systems.
The Bottleneck
The problem isn't the model. It's the infrastructure around it.
Engineering leaders tell us the same thing. The model works. The demo looks great. Then production hits three walls:
Security
Agents execute code and call APIs — who authorized that?
Non-human identities have no standard auth model
Prompt injection and tool misuse are unsolved at scale
Scale
Agents are long-running, stateful, concurrent
Traditional serverless is ephemeral — no memory, no persistence
10 developers running 5 agents each = 50 concurrent workflows, all day
Governance
Which agents can access which systems?
How do you audit what an agent did and why?
MCP and other tool-calling protocols have no enterprise controls
These aren't AI problems. They're infrastructure problems. And they need infrastructure answers.
The New Architecture
What agent-native infrastructure actually requires
Compute
Persistent execution environments
Isolated sandboxes per agent / task
Version-controlled code artifacts
Concurrent workflow orchestration at scale
Security — Zero Trust for Agents
Identity for non-human workloads
Private networking between agents and services
Policy enforcement at every tool call
Auditable execution trails
Tooling
Persistent memory across agent sessions
Multi-model access (avoid vendor lock-in)
Voice / email / communication primitives
SDK that abstracts agent complexity
The architecture diagram looks simple. The integration is where teams get stuck. You need these three layers to talk to each other from day one — not stitched together later.
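One way to see the integration problem: even a minimal agent definition has to reference all three layers at once. The sketch below is purely illustrative — every interface and field name is a hypothetical stand-in, not a real platform API — but it shows why compute, security, and tooling can't be designed in isolation.

```typescript
// Hypothetical agent definition wiring all three layers together.
// Every name here is illustrative, not a real platform API.
interface AgentSpec {
  // Compute: a persistent, isolated sandbox pinned to a code version
  sandbox: { image: string; codeVersion: string; persistent: boolean };
  // Security: a non-human identity plus per-capability tool grants
  identity: { clientId: string; audience: string };
  allowedTools: string[]; // policy is enforced per capability, per call
  // Tooling: memory namespace and the models the agent may use
  memoryNamespace: string;
  models: string[];
}

// A validity check only works because all three layers are in one spec.
function validateSpec(spec: AgentSpec): string[] {
  const problems: string[] = [];
  if (!spec.sandbox.persistent) problems.push("agents need persistent compute");
  if (!spec.identity.clientId) problems.push("missing non-human identity");
  if (spec.allowedTools.length === 0) problems.push("no tools granted");
  return problems;
}

const reviewer: AgentSpec = {
  sandbox: { image: "linux-sandbox", codeVersion: "v42", persistent: true },
  identity: { clientId: "agent-code-reviewer", audience: "internal-apis" },
  allowedTools: ["repo.read", "pr.comment"],
  memoryNamespace: "code-reviewer",
  models: ["provider-a/large", "provider-b/fast"],
};

console.log(validateSpec(reviewer)); // []
```

A spec that omits any one layer fails validation — which is the point: stitching the layers together later means every existing agent definition has to change.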
Cloudflare's Approach
Cloud 2.0 — Infrastructure designed for the agentic era
We call this Cloud 2.0. Not because the old cloud is going away — but because the old cloud wasn't designed for agents.
Edge-native by design
Agents run closest to where work happens — low latency, global distribution, no cold-start penalty for persistent workloads.
Zero-trust by default
Every agent action is authenticated and policy-governed. Non-human identities are first-class. Security isn't a layer you add — it's the substrate.
Integrated, not assembled
Compute, security, and tooling from one platform. No glue code. No three-vendor blame-shifting when something breaks.
The same properties that made Cloudflare the right choice for securing the web make it the right shape for agent infrastructure: global, secure, programmable at the edge.
Key Capabilities
Four pillars — and what they mean for your teams
Persistent, Isolated Compute
Agents run in sandboxed Linux environments with versioned code storage, scaling to 50,000 concurrent workflows.
Your agents can run production workloads — not just demos. Scale is not a ceiling you'll hit on day one.
Zero Trust for Non-Human Identities
Machine-to-machine auth built on open standards (RFC 9728). Private networking connects agents to internal services without public exposure.
Your security team can approve agents for production. No more "we don't have an auth model for this."
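Concretely, RFC 9728 (OAuth 2.0 Protected Resource Metadata) gives agents a standard way to discover which authorization servers govern an API before calling it: the metadata lives at a well-known URL derived from the resource identifier, with the well-known path inserted between the host and the resource's own path. A minimal sketch of that derivation — fetching and validating the returned JSON (fields like `resource` and `authorization_servers`) is left out:

```typescript
// Builds the RFC 9728 protected-resource metadata URL for a resource
// identifier. Per the RFC, the well-known path segment is inserted
// between the host and the resource's path component.
function metadataUrl(resource: string): string {
  const u = new URL(resource);
  const path = u.pathname === "/" ? "" : u.pathname;
  return `${u.origin}/.well-known/oauth-protected-resource${path}`;
}

console.log(metadataUrl("https://api.example.com/mcp"));
// https://api.example.com/.well-known/oauth-protected-resource/mcp
```

This discovery step is what lets a security team reason about agent auth the same way they reason about human SSO: the trust relationships are published, not hardcoded.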
Governed Tool Access (MCP)
Enterprise governance layer for Model Context Protocol. Define which agents can call which tools, with full audit logs.
When your compliance team asks "what did the agent do?" — you can answer.
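The governance pattern itself is simple to state: check the agent's grant before dispatching any tool call, and append every attempt — allowed or denied — to an audit trail. A minimal sketch, with all identifiers hypothetical:

```typescript
// Governance wrapper around MCP-style tool calls: authorize per
// capability, then record the attempt either way. Names are illustrative.
type ToolCall = { agentId: string; tool: string; args: unknown };
type AuditEntry = ToolCall & { allowed: boolean; at: string };

// Per-agent capability grants (in practice, centrally managed policy)
const grants: Record<string, string[]> = {
  "agent-triage": ["tickets.read", "tickets.label"],
};
const auditLog: AuditEntry[] = [];

function callTool(call: ToolCall, dispatch: (c: ToolCall) => unknown): unknown {
  const allowed = (grants[call.agentId] ?? []).includes(call.tool);
  // Log before deciding, so denied attempts are visible too
  auditLog.push({ ...call, allowed, at: new Date().toISOString() });
  if (!allowed) throw new Error(`${call.agentId} may not call ${call.tool}`);
  return dispatch(call);
}

const result = callTool(
  { agentId: "agent-triage", tool: "tickets.read", args: { id: 7 } },
  () => "ok",
);
console.log(result, auditLog.length);
```

Denied calls land in the log too — so the answer to "what did the agent do?" includes what it tried to do.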
Unified AI Platform + Memory
Access to 14+ model providers through a single interface, plus persistent memory that survives across agent sessions.
You're not locked into one model vendor. And your agents remember context — they get smarter over time, not just faster.
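The key property of session-surviving memory is that it lives outside the session object: sessions come and go, notes keyed by agent persist. A toy sketch of that shape — a real platform would back the store with durable storage, so the in-memory Map here is a stand-in:

```typescript
// Session-surviving memory: notes are keyed by agent in a store that
// outlives any individual session. A Map stands in for durable storage.
const memoryStore = new Map<string, string[]>();

class AgentSession {
  constructor(private agentId: string) {}

  remember(note: string): void {
    const notes = memoryStore.get(this.agentId) ?? [];
    notes.push(note);
    memoryStore.set(this.agentId, notes);
  }

  recall(): string[] {
    return memoryStore.get(this.agentId) ?? [];
  }
}

// One session writes; a brand-new session for the same agent still reads it.
new AgentSession("doc-bot").remember("style guide lives in /docs/style");
console.log(new AgentSession("doc-bot").recall());
```

Separating identity-keyed memory from session lifetime is what lets "the agent gets smarter over time" be true across restarts, not just within one run.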
Enterprise Readiness
The questions your security and compliance teams will ask — answered
How do we control which agents can access which internal systems?
Private mesh networking keeps agent traffic off the public internet. Policy enforcement at the tool layer means access is granted per capability, not broadly per agent.
What happens when an agent is compromised or acts unexpectedly?
Isolated execution environments mean blast radius is contained by default. Every action is logged. You can audit, replay, and revoke.
Our compliance team will never approve a third-party system calling our internal APIs.
RFC 9728 Managed OAuth gives agents verifiable, revocable identities — the same model your compliance team uses for human SSO. Non-human identities are governed, not assumed.
This is what enterprise-grade agent infrastructure looks like
Getting Started
A 90-day pathway from exploration to production
01
Identify
Map existing AI usage and pain points
Pick one use case with clear success criteria
Engage security and compliance early
Clear scope — not "all agents"
02
Pilot
Stand up a sandboxed compute environment
Implement machine identity and audit logging from day one
Run the agent against a non-production system
Security team has signed off
03
Scale
Expand to production with governance controls in place
Connect persistent memory and multi-model access
Define runbooks for agent failures and escalations
Repeatable pattern for future agents
Most teams start with internal tooling — code review, documentation, ticket triage. High value, lower risk, and a natural on-ramp to the governance model you'll need for customer-facing agents.
Discussion
Three questions worth bringing back to your team
Where are your agents in the journey — demo, pilot, or heading toward production? What's the blocker?
Does your current cloud architecture have an answer for non-human identity and agent governance?
Which internal use case would deliver the most value in the next 90 days — and what would it take to get your security team to yes?