perea.ai Research · 1.0 · Public draft

Hyperscaler Agent Runtimes 2026

AgentCore vs Foundry vs Agent Engine vs AgentKit — The Architectural Decision Matrix for Production Agent Deployment

Author: Dante Perea
Published: May 2026
Length: 2,659 words · 12 min read
Audience: Platform engineering leads selecting an agent runtime, founders deciding which hyperscaler ecosystem to commit to, and architects designing production agent deployments who need pricing, isolation, framework, and observability comparisons in one place.
License: CC BY 4.0

#Hyperscaler Agent Runtimes 2026

#What this paper is, in one sentence

The four hyperscaler-managed agent runtimes — AWS Bedrock AgentCore (GA Oct 13 2025[1]), Microsoft Foundry Agent Service (GA March 2026[2]), Google Vertex Agent Engine on the Gemini Enterprise Agent Platform[3], and OpenAI AgentKit (DevDay 2025[4]) — converge on a similar consumption-based pricing shape ($0.0864–$0.0994[5][6][7] per vCPU-hour, $0.0090–$0.0118[5][6][7] per GiB-hour) and on a similar set of enabling primitives (session isolation, framework agnosticism, OpenTelemetry observability, MCP/A2A protocol support), and the architectural decision between them now comes down to four axes: ecosystem alignment, framework lock-in tolerance, pricing-model fit, and what isolation guarantee your workload actually needs.

#Why hyperscaler-managed runtimes exist

Roll-your-own agent infrastructure was the 2024 default. By 2025 the operational cost of building and maintaining it — session isolation, autoscaling from zero, multi-hour execution windows, identity management, observability, framework integration — had become substantial enough that all four hyperscalers shipped a managed runtime within a six-month window: AWS in October 2025[1], OpenAI also in October 2025[4], Google through 2025[3], and Microsoft in March 2026[8].

The shared architectural insight: agents spend 30–70%[9][5] of session wall-clock time in I/O wait — waiting for LLM responses, tool calls, or database queries. A traditional compute model that bills for allocated resources during that wait is mispriced for the workload by 30–70%[9]. All four hyperscalers responded with consumption-based pricing that bills only for active CPU during processing periods (and idle/wait time free or near-free), aligning cost with value creation.[5][6][7][9]
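The mispricing claim is easy to make concrete. A minimal sketch, with an illustrative list rate and session shape (not any one provider's actual bill):

```python
# Compare allocation-based vs consumption-based billing for one session.
# Illustrative numbers: 1 vCPU at $0.09/vCPU-hour, a 60 s session with
# 40% of wall clock spent in I/O wait (inside the 30-70% range cited above).
VCPU_HOUR = 0.09          # $/vCPU-hour, illustrative list rate
wall_clock_s = 60
io_wait_fraction = 0.40

allocation_cost = wall_clock_s / 3600 * VCPU_HOUR                        # bills the full minute
consumption_cost = wall_clock_s * (1 - io_wait_fraction) / 3600 * VCPU_HOUR  # bills active CPU only

savings = 1 - consumption_cost / allocation_cost
print(f"allocation ${allocation_cost:.6f}  consumption ${consumption_cost:.6f}  savings {savings:.0%}")
```

The savings fraction equals the I/O-wait fraction: a 30–70% wait share is a 30–70% overcharge under allocation-based billing, which is exactly the gap the consumption model closes.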

The shared list of enabling features that every runtime now ships:

  • Session isolation (microVM, container, or per-session sandbox) preventing cross-session data contamination.[1][10][11]

  • Multi-hour execution windows for long-running agent workflows (8 hours on AgentCore[1]; comparable on Vertex/Foundry).

  • Framework agnosticism — every runtime supports LangChain/LangGraph, CrewAI, LlamaIndex plus its own native framework.[1][2][3][4]

  • OpenTelemetry-compatible observability with traces, metrics, dashboards.[12][13]

  • Native MCP and A2A protocol support (or Responses API for OpenAI).[1][14][15]

  • Identity / OAuth integration with enterprise IdPs.[1][2]

  • VPC/PrivateLink/BYO-VNet for enterprise-grade network isolation.[1][16]

The differences live in pricing per resource, framework defaults, cross-product integration, and a few architectural details — particularly the choice between direct code deployment (AgentCore, AgentKit) and container deployment (Foundry hosted agents, AgentCore container option).

#The pricing matrix

Numbers below are list price per vCPU-hour and GiB-hour at GA, unmodified by enterprise commits.

| Runtime | vCPU-hour | GiB-hour | Free tier | Billing granularity |
|---|---|---|---|---|
| AWS Bedrock AgentCore | $0.0895[5] | $0.00945[5] | None (post-Sep 16 2025[9]) | Per second, 1-second min[5] |
| Microsoft Foundry Hosted | $0.0994[7] | $0.0118[7] | 30-day Azure free trial[7] | Per hour[7] |
| Google Vertex Agent Engine | $0.0864[6] | $0.0090[6] | 50 vCPU-hr + 100 GiB-hr / month, perpetual[6] | Per second[6] |
| OpenAI AgentKit / Agents SDK | Token-based[4] | Token-based[4] | API standard tier | Per token[4] |

The pricing pattern, decoded:[5][6][7]

  • Vertex is the cheapest list price ($0.0864/$0.0090) and has the most generous perpetual free tier (50 vCPU-hours and 100 GiB-hours per month per project[6]).
  • AgentCore is mid-list ($0.0895/$0.00945) with the most aggressive idle discount: I/O wait and idle time are free when no other background process is running, and memory is billed at peak usage only.[5][9]
  • Foundry is the most expensive at list ($0.0994/$0.0118) but offers the Agent Commit Units (ACU) pre-purchase plan with up to 15%[17] discount at the 500K-unit tier ($425K[17] for 500K ACUs).
  • AgentKit doesn't charge a separate runtime fee — pricing is folded into standard token-based API pricing.[4][15] This is structurally different from the other three.
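Plugging the cited list rates into a monthly bill makes the free-tier effect visible. A sketch assuming a small, hypothetical workload of 40 active vCPU-hours and 80 GiB-hours per month:

```python
# Monthly compute bill at list rates for the three compute-billed runtimes.
# The 40 vCPU-hr / 80 GiB-hr workload is an assumption for illustration.
RATES = {  # (vCPU-hour, GiB-hour) list prices cited in the table above
    "agentcore": (0.0895, 0.00945),
    "foundry":   (0.0994, 0.0118),
    "vertex":    (0.0864, 0.0090),
}
FREE_TIER = {"vertex": (50, 100)}  # perpetual monthly free vCPU-hr / GiB-hr

def monthly_bill(runtime: str, vcpu_hours: float, gib_hours: float) -> float:
    free_v, free_g = FREE_TIER.get(runtime, (0, 0))
    rate_v, rate_g = RATES[runtime]
    return max(vcpu_hours - free_v, 0) * rate_v + max(gib_hours - free_g, 0) * rate_g

for rt in RATES:
    print(f"{rt:9s} ${monthly_bill(rt, 40, 80):.2f}")
```

At this scale the workload lands entirely inside Vertex's perpetual free tier ($0.00), against roughly $4.34 on AgentCore and $4.92 on Foundry at list.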

Adjacent service pricing (where bundles diverge):[5][6][18]

  • AgentCore Gateway: $0.005[18] per 1,000 API calls; $0.025[18] per 1,000 search queries; $0.02[18] per 100 tools indexed/month.
  • AgentCore Memory: $0.25[18] per 1,000 short-term events; $0.75[18] per 1,000 long-term events/month; $0.50[18] per 1,000 retrievals.
  • Vertex Sessions: $0.20[6] per 1,000 events (billing started Feb 11, 2026[6]).
  • Vertex Memory Bank: $0.20[6] per 1,000 memories stored/month; $0.50[6] per 1,000 retrievals (first 1,000 free).
  • Foundry tools: File Search Storage $0.11[7]/GB/day (1 GB free); Code Interpreter $0.033[7]/session; Web Search $14[7]/1,000 transactions.
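The adjacent-service rates diverge more than the compute rates do. A sketch at an assumed, hypothetical volume (200,000 session events, 50,000 long-term memories, 100,000 retrievals per month); note the bundles differ, so this is only roughly apples-to-apples: Vertex splits Sessions and Memory Bank, while AgentCore bills short- and long-term memory under one service.

```python
# Memory-service bill at the cited list rates; all volumes are assumptions.
events, memories, retrievals = 200_000, 50_000, 100_000

# AgentCore Memory: $0.25/1k short-term events, $0.75/1k long-term, $0.50/1k retrievals
agentcore = events / 1000 * 0.25 + memories / 1000 * 0.75 + retrievals / 1000 * 0.50

# Vertex: Sessions at $0.20/1k events; Memory Bank at $0.20/1k stored plus
# $0.50/1k retrievals, with the first 1,000 retrievals free
vertex = events / 1000 * 0.20 + memories / 1000 * 0.20 + max(retrievals - 1000, 0) / 1000 * 0.50

print(f"AgentCore ≈ ${agentcore:.2f}/mo, Vertex ≈ ${vertex:.2f}/mo")
```

At these volumes the memory bill ($99–$138/month here) can exceed the compute bill by an order of magnitude, which is why the adjacent services belong in any serious comparison.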

Worked example. A customer-support agent handling 10,000[9] sessions/day, each 60 seconds[9] wall-clock with 18 seconds[9] active CPU at 1 vCPU and 1.5–2.5 GB[9] memory. AgentCore-billed cost per session: 18s × ($0.0895/3600) + memory charges = $0.0007625[9] (a 70%[9] reduction vs allocation-based pricing on the same workload).[9] Across the four hyperscalers, this workload would land in the same low-thousandths-of-a-cent-per-session range — the differentiator at this scale is not unit cost but ecosystem fit.
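The per-session figure above reproduces exactly if you assume 2 GiB peak memory (the midpoint of the stated 1.5–2.5 GB range, an assumption on our part) billed across the full 60 s wall clock, with CPU billed only for the 18 active seconds:

```python
# Reproduce the AgentCore worked example: 18 s active CPU at 1 vCPU,
# 2 GiB peak memory (assumed midpoint of 1.5-2.5 GB) over a 60 s session.
CPU_RATE, MEM_RATE = 0.0895, 0.00945   # $/vCPU-hour, $/GiB-hour list rates

cpu_cost = 18 / 3600 * CPU_RATE * 1    # only active CPU seconds are billed
mem_cost = 60 / 3600 * MEM_RATE * 2    # peak memory is billed for the whole session

per_session = cpu_cost + mem_cost
daily = per_session * 10_000
print(f"per session ${per_session:.7f}, per day ${daily:.2f}")
```

That is $0.0007625 per session and about $7.63/day for the full 10,000-session fleet, consistent with the compute bill being a rounding error next to model-token spend.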

#Session isolation: the architectural axis everyone competes on

AgentCore Runtime[1][10]: dedicated microVM per user session with isolated CPU, memory, and filesystem. After session completion, the microVM terminates and memory is sanitized; AWS describes this as "deterministic security even when working with non-deterministic AI processes."[10] Eight-hour execution windows; automatic scaling from zero to thousands of sessions. A single, comprehensive SDK spans Memory, Tools, Gateway, Identity, Observability, Browser Tool, and Code Interpreter.[1][14]

Foundry Hosted Agents (preview/GA[11]): per-session VM-isolated sandboxes with persistent filesystem ($HOME and /files), enabling scale-to-zero with stateful resume. Each session gets a dedicated sandbox; sessions are isolated from each other; state is automatically restored when a session resumes after going idle. CPU/memory range: 0.25 vCPU/0.5 GiB to 2 vCPU/4 GiB[11]. Default limit 50[11] concurrent sessions per subscription per region (adjustable). Deployed via Azure Container Registry; runs in Microsoft-managed infrastructure; container subnet delegated to Microsoft.App/environments, /27 minimum, /24 default.[16]

Vertex Agent Engine[3][19]: managed runtime with serverless efficiency, billed per second; idle time is not billed. Single-command deployment via the adk deploy CLI[19]. Hundreds of thousands[19] of agents have been deployed since launch, scaling from zero to global on Google's serverless infrastructure. Native A2A protocol support, a secure code-execution sandbox, and observability via Google Cloud Trace.[19]

AgentKit / Agents SDK[4][15]: the OpenAI runtime is structurally different — there is no managed compute layer separate from the API. Sandbox agents (v0.14.0+[20]) ship in the Agents SDK with native sandbox execution, a Manifest abstraction for the workspace, and snapshotting and rehydration that support recovery if the original sandbox container fails. Storage providers: AWS S3, Google Cloud Storage, Azure Blob, Cloudflare R2[20]. Durable execution via Temporal, Restate, or DBOS integrations[21]. Python-first, with TypeScript planned; v0.15.1[22] (May 2, 2026), MIT-licensed, 25,912[22] GitHub stars, 270[22] contributors.

The isolation tradeoff that decides workloads. AgentCore's microVM is the strongest isolation guarantee — true VM-level separation, deterministic sanitization, designed explicitly for "non-deterministic AI processes."[10] Foundry's per-session sandbox is comparable for most workloads but uses container-level isolation. Vertex is serverless-managed (isolation guarantees not as explicitly documented at the microVM level). AgentKit gives developers explicit choice — run sandbox locally, run sandbox managed, or skip sandbox entirely — at the cost of moving the isolation responsibility to the developer.[4][20]

#Framework support: who hosts whom

| Runtime | Native frameworks | OSS frameworks supported |
|---|---|---|
| AgentCore | Strands Agents (AWS-developed) | LangChain, LangGraph, CrewAI, LlamaIndex, Google ADK, OpenAI Agents SDK[1][14] |
| Foundry | Microsoft Agent Framework, Semantic Kernel | LangChain, LangGraph, OpenAI Agents SDK[2][8] |
| Vertex AE | Google ADK | LangChain, LangGraph, LlamaIndex, AG2, custom[3][19] |
| AgentKit | Agent Builder (visual), Agents SDK (code) | 100+ LLM providers[22] via Agents SDK abstractions; Responses-API compatible |

The framework lesson. All four runtimes support the major OSS frameworks (LangChain/LangGraph at minimum), but each has a native preference. AgentCore is the most framework-agnostic by design; Foundry is wire-compatible with the OpenAI Responses API[8] (the cleanest migration path off OpenAI's hosted runtime); Vertex is most opinionated toward Google ADK but supports A2A interop; AgentKit assumes you're using the Agents SDK or its visual Agent Builder.[4][15]

#Observability and protocol surface

All four ship OpenTelemetry-compatible tracing.[12][13] AgentCore Observability emits OTEL natively and integrates with Amazon CloudWatch + external providers (Dynatrace, Datadog, Arize Phoenix, LangSmith, Langfuse[1]). Foundry's observability runs through Foundry Control Plane with Azure Monitor, continuous evaluation, end-to-end tracing GA in March 2026[8]. Vertex Agent Engine integrates with Google Cloud Trace, custom metrics, alerts.[19] AgentKit ships with Evals (datasets, trace grading, automated prompt optimization, third-party model support[15]).

Protocol-level support, mid-2026:[1][2][3][4]

  • AgentCore: A2A protocol native; MCP servers via Gateway; OAuth + IAM authorization; broader A2A coming across all services.[1]
  • Foundry: Responses-API native; MCP over private network paths; OAuth passthrough.[8][16]
  • Vertex AE: A2A native; ADK CLI + adk deploy.[19]
  • AgentKit: Responses API + MCP via Connector Registry.[4][15]

#Compliance, security, and enterprise networking

AgentCore[1]: VPC + AWS PrivateLink + AWS CloudFormation + resource tagging + IAM (in addition to OAuth) + Cedar policy language; nine[1] AWS Regions at GA (Mumbai, Singapore, Sydney, Tokyo, Dublin, Frankfurt, N. Virginia, Ohio, Oregon).

Foundry[16][7]: Standard Setup with private networking — BYO VNet, no public egress, container/subnet injection. Microsoft Entra Agent ID for identity. Azure Storage / Cosmos DB / AI Search for stateful entity persistence.

Vertex AE[3]: Google Cloud's standard enterprise primitives — VPC Service Controls, IAM, Customer-Managed Encryption Keys.

AgentKit[4]: standard OpenAI API authentication and Responses API session management — enterprise networking is the developer's responsibility.

#The four-axis decision tree

Picking a runtime in 2026 reduces to four sequential decisions:

1. Ecosystem. What identity provider does your org use? What cloud do your data and tools already live in? AWS shop → AgentCore; Microsoft 365/Azure shop → Foundry; GCP/Workspace shop → Vertex AE; OpenAI-API-first → AgentKit. The cross-cloud tax (data transfer, identity federation, observability fragmentation) is rarely worth paying for compute that is ~10%[7] cheaper at list.

2. Framework. What framework is your team productive in today? LangGraph + LangChain → all four work; OpenAI Agents SDK → AgentKit native, all three others supported; Google ADK → Vertex AE native; Microsoft Agent Framework → Foundry native; Strands → AgentCore native. "Native" buys zero-config tracing and the smoothest deploy path.

3. Pricing fit. What does your I/O wait pattern look like? Heavy I/O wait → AgentCore (idle = free); steady CPU → all four are roughly equivalent; bursty + occasional → Vertex's perpetual free tier is genuinely valuable; token-heavy + light compute → AgentKit (no separate compute layer).[5][6][7][9]

4. Isolation. What's the threat model? Multi-tenant with cross-customer leakage as the failure mode → AgentCore microVM is strongest. Internal enterprise with network-layer isolation sufficient → Foundry container-per-session works. Single-tenant high-throughput → Vertex serverless is fine. Developer-controlled environment → AgentKit puts the choice on you.
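The four decisions above can be compressed into a few lines of code. This is an illustrative encoding of the prose, not a vendor tool; the ecosystem labels and override rules are simplifications:

```python
# An illustrative encoding of the four-axis decision tree above.
def pick_runtime(ecosystem: str, heavy_io_wait: bool, multi_tenant: bool) -> str:
    # Axis 1: ecosystem alignment dominates unless an axis below overrides it.
    default = {
        "aws": "AgentCore", "azure": "Foundry",
        "gcp": "Vertex AE", "openai": "AgentKit",
    }[ecosystem]
    # Axis 4: cross-tenant leakage as the failure mode points at microVM isolation,
    # which AgentKit leaves to the developer.
    if multi_tenant and default == "AgentKit":
        return "AgentCore"
    # Axis 3: heavy I/O wait rewards idle-free billing; on Foundry, model the
    # wait share before committing to ACU pre-purchases.
    if heavy_io_wait and default == "Foundry":
        return "Foundry (price the I/O-wait share before committing ACUs)"
    return default

print(pick_runtime("aws", heavy_io_wait=True, multi_tenant=True))
```

Axis 2 (framework) is deliberately absent: per the table above, every runtime hosts the major OSS frameworks, so it changes the migration cost, not the destination.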

#Five anti-patterns

1. Picking on raw vCPU price alone. Compute is rarely the dominant cost — model tokens, vector storage, and tool invocations usually dominate.[5][6][17]

2. Underweighting framework lock-in. A "native" framework is faster on day 1 but costs more on day 365 when you want to move. AgentCore's deliberate framework-agnosticism is a hedge worth pricing.[1][14]

3. Skipping the I/O-wait analysis. AgentCore's pricing model materially rewards I/O-heavy workloads (~70%[9] cheaper for typical RAG-style sessions). If your sessions spend ≥30%[9] of wall clock in I/O wait, your effective rate diverges sharply from the headline vCPU price.

4. Treating Vertex's free tier as marketing. 50 vCPU-hours + 100 GiB-hours per month per project, perpetual.[6] Real teams ship production agents within that envelope and never see a runtime bill.

5. Choosing AgentKit when you actually want a managed runtime. AgentKit is exceptional for OpenAI-first stacks with the Agents SDK, but it does not provide the same managed-compute, multi-region, PrivateLink, BYO-VNet enterprise primitives the other three do.[4][15] If your security review requires that surface, you want one of the cloud-native runtimes.

#What this paper does not cover

This paper does not cover: detailed comparison of memory/state services across runtimes (worth a separate deep-dive), per-region availability and pricing variation outside US/EU/APAC core, the long-running cron-style agent scheduler primitives that each cloud is starting to ship, hybrid-cloud agent deployment patterns spanning two hyperscalers, or specific compliance certification matrices (HIPAA, FedRAMP, SOC 2 Type 2 by tier).

#References

  1. AWS, Amazon Bedrock AgentCore is now generally available. https://aws.amazon.com/about-aws/whats-new/2025/10/amazon-bedrock-agentcore-available/ (Oct 13, 2025)

  2. Microsoft Foundry, Foundry Agent Service product page. https://azure.microsoft.com/en-us/products/ai-agent-service

  3. Google Cloud, Gemini Enterprise Agent Platform (formerly Vertex AI). https://cloud.google.com/products/gemini-enterprise-agent-platform

  4. OpenAI, Introducing AgentKit. https://openai.com/index/introducing-agentkit/ (Oct 6, 2025)

  5. AWS, Amazon Bedrock AgentCore Pricing. https://aws.amazon.com/bedrock/agentcore/pricing/

  6. Google Cloud, Gemini Enterprise Agent Platform pricing. https://cloud.google.com/vertex-ai/pricing?hl=en

  7. Microsoft Azure, Foundry Agent Service Pricing. https://azure.microsoft.com/en-us/pricing/details/foundry-agent-service/

  8. Microsoft, What's new in Microsoft Foundry March 2026 (Foundry Agent Service GA). https://devblogs.microsoft.com/foundry/whats-new-in-microsoft-foundry-mar-2026 (Apr 9, 2026)

  9. AWS, Securely launch and scale your agents on Amazon Bedrock AgentCore Runtime. https://aws.amazon.com/blogs/machine-learning/securely-launch-and-scale-your-agents-and-tools-on-amazon-bedrock-agentcore-runtime (Aug 13, 2025)

  10. AWS, Host agent or tools with Amazon Bedrock AgentCore Runtime. https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/agents-tools-runtime.html

  11. Microsoft Learn, Hosted agents in Foundry Agent Service. https://learn.microsoft.com/en-us/azure/ai-foundry/agents/concepts/hosted-agents?view=foundry

  12. Microsoft Azure, Observability in Foundry Control Plane. https://azure.microsoft.com/en-ca/products/ai-foundry/observability

  13. Microsoft, Foundry Agent Service, Observability, and Foundry Portal Now Generally Available. https://techcommunity.microsoft.com/t5/microsoft-foundry-blog/building-production-ready-secure-observable-ai-agents-with-real/ba-p/4501074

  14. AWS, Make agents a reality with Amazon Bedrock AgentCore now generally available. https://aws.amazon.com/blogs/machine-learning/amazon-bedrock-agentcore-is-now-generally-available/ (Oct 13, 2025)

  15. OpenAI, Build, deploy, and optimize agentic workflows with AgentKit (cookbook). https://developers.openai.com/cookbook/examples/agentkit/agentkit_walkthrough/

  16. Microsoft Learn, Foundry Agent Service FAQ (networking, VNet, subnets). https://learn.microsoft.com/en-us/azure/foundry/agents/faq

  17. Microsoft Azure, Microsoft Foundry pricing — Agent Commit Units (ACU) plan. https://azure.microsoft.com/en-us/pricing/details/ai-foundry/

  18. pump.co, Amazon Bedrock AgentCore: Pricing, Features & How It Works. https://www.pump.co/blog/amazon-bedrock-agentcore/

  19. Google Cloud Blog, More ways to build and scale AI agents with Vertex AI Agent Builder. https://cloud.google.com/blog/products/ai-machine-learning/more-ways-to-build-and-scale-ai-agents-with-vertex-ai-agent-builder (Nov 6, 2025)

  20. OpenAI, The next evolution of the Agents SDK. https://openai.com/index/the-next-evolution-of-the-agents-sdk/ (Apr 15, 2026)

  21. OpenAI, Running agents — durable execution integrations (Temporal, Restate, DBOS). https://openai.github.io/openai-agents-python/running_agents/

  22. openai/openai-agents-python GitHub. https://www.github.com/openai/openai-agents-python
