
agentgateway: Architecture and Design Analysis

2026-05-13 · Source: agentgateway-architecture-analysis.md

architecture · ai-infra · cloud-native · gateway · llm · mcp

Original: raw/agentgateway-architecture-analysis.md · Repository: agentgateway/agentgateway · Analyzed version: v1.2.0-alpha.2+24 (HEAD 9ca3e04)

Positioning in One Sentence

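agentgateway is an L7 proxy for AI traffic: it unifies LLM APIs (OpenAI / Anthropic / Gemini / Bedrock / Vertex / Azure / Copilot), MCP tool calls, and A2A agent-to-agent communication in a single Rust data plane, and uses Gateway API CRDs + xDS as a declarative control plane for config distribution. In essence it combines Istio's control-plane skeleton with a Rust data plane specialized for AI protocols: it carries over the mesh-proven infrastructure (reactive config distribution, KRT reactive tables, HBONE mTLS tunnels) to build an AI gateway.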

Core Architecture Diagram

flowchart TB
  K8s["K8s API<br/>kubectl apply CRDs"]
  subgraph CP["CONTROL PLANE - Go (controller/)"]
    direction TB
    Watch["Watch Gateway / HTTPRoute<br/>+ AGPol / AGBe / AGParams"]
    KRT["KRT collections<br/>(Istio reactive tables)"]
    Trans["Translator<br/>K8s → xDS resource.Resource"]
    ADS["ADS Server<br/>SotW + Delta"]
    Deployer["Deployer<br/>Helm / Pod"]
    Watch --> KRT --> Trans --> ADS
    ADS <--> Deployer
  end
  subgraph DP["DATA PLANE - Rust (crates/agentgateway*)"]
    direction TB
    App["agentgateway-app/<br/>thin binary (signal/tracing)"]
    Lib["agentgateway/lib"]
    Ctrl["control/<br/>xDS AdsClient (delta)"]
    State["state_manager + store/<br/>Stores { binds, discovery, ads }"]
    subgraph Proxy["proxy/"]
      Gw["gateway.rs<br/>TCP accept loop"]
      Http["httpproxy.rs<br/>route + policy"]
      Tcp["tcpproxy.rs"]
      Pool["pool (connection)"]
    end
    HTTP["http/<br/>filters · retry · transform"]
    Tport["transport/<br/>HBONE = mTLS HTTP/2"]
    LLM["llm/<br/>7 providers<br/>AIProvider enum + Provider trait<br/>OpenAI-compat unify"]
    MCP["mcp/<br/>App/Relay + upstream/<br/>stdio · HTTP · SSE · Streamable<br/>federation · CEL RBAC"]
    A2A["a2a/<br/>classifier + URL rewriter"]
    CEL["cel/<br/>cel-fork + celx"]
    Tel["telemetry/<br/>OTLP + access log"]
    Mgmt["management/<br/>admin + debug + ui"]
    App --> Lib
    Lib --> Ctrl & State & Proxy & HTTP & Tport & LLM & MCP & A2A & CEL & Tel & Mgmt
  end
  subgraph UI["UI - TypeScript (ui/, Next.js)"]
    UIH["/ui/ HTTP → management → live config view"]
  end
  K8s --> Watch
  ADS -- "gRPC DeltaAggregatedResources<br/>resource.Resource oneof (9 variants):<br/>Bind/Listener/Route/TCPRoute/Backend/<br/>Policy/Workload/Service/RouteGroup" --> Ctrl
  classDef cp fill:#1e3a8a,stroke:#1e40af,color:#fff
  classDef dp fill:#7c2d12,stroke:#9a3412,color:#fff
  classDef ui fill:#14532d,stroke:#166534,color:#fff
  class CP,Watch,KRT,Trans,ADS,Deployer cp
  class DP,App,Lib,Ctrl,State,Proxy,Gw,Http,Tcp,Pool,HTTP,Tport,LLM,MCP,A2A,CEL,Tel,Mgmt dp
  class UI,UIH ui
Original ASCII diagram
                  ┌──────────────────────────────────────────────┐
                  │  CONTROL PLANE  (Go, controller/)            │
                  │                                              │
   K8s API ──▶  ┌─┴────────┐   KRT collections (Istio reactive) │
                │ Watch     │   ┌─────────────┐                  │
                │ Gateway   │──▶│ Translator  │                  │
                │ HTTPRoute │   │ K8s → xDS   │                  │
                │ CRDs:     │   └──────┬──────┘                  │
                │  AGPol    │          │                         │
                │  AGBe     │   ┌──────▼──────┐    ┌──────────┐  │
                │  AGParams │   │ ADS Server  │◄──▶│ Deployer │  │
                └───────────┘   │ SotW+Delta  │    │ Helm/Pod │  │
                  ▲             └──────┬──────┘    └──────────┘  │
                  │                    │ gRPC                    │
                  │ kubectl/CRD        │ DeltaAggregatedResources│
                  └──────┐             │                         │
                         │             ▼                         │
   ┌────────────────────────────────────────────────────────────┘
   │                     │ resource.Resource (oneof 9):
   │                     │ Bind/Listener/Route/TCPRoute/Backend/
   │                     │ Policy/Workload/Service/RouteGroup
   │                     │
   │  ┌──────────────────▼────────────────────────────────────┐
   │  │ DATA PLANE (Rust, crates/agentgateway*)               │
   │  │                                                       │
   │  │  agentgateway-app/   ← thin binary, signal/tracing    │
   │  │       │                                               │
   │  │       └─▶ agentgateway/lib                            │
   │  │             ├─ control/   xds AdsClient (delta)       │
   │  │             ├─ state_manager + store/                 │
   │  │             │     Stores { binds, discovery, ads }    │
   │  │             ├─ proxy/                                 │
   │  │             │    ├─ gateway.rs  (TCP accept loop)     │
   │  │             │    ├─ httpproxy.rs (route + policy)     │
   │  │             │    ├─ tcpproxy.rs                       │
   │  │             │    └─ pool (connection)                 │
   │  │             ├─ http/      (filters, retry, transform) │
   │  │             ├─ transport/ (HBONE = mTLS HTTP/2)       │
   │  │             ├─ llm/       7 providers behind          │
   │  │             │             AIProvider enum + Provider  │
   │  │             │             trait, OpenAI-compat unify  │
   │  │             ├─ mcp/       App/Relay + upstream/       │
   │  │             │             stdio|HTTP|SSE|Streamable   │
   │  │             │             federation, CEL RBAC        │
   │  │             ├─ a2a/       classifier + URL rewriter   │
   │  │             ├─ cel/       crates/cel-fork + celx      │
   │  │             ├─ telemetry/ OTLP + access log           │
   │  │             └─ management/ admin + debug + ui         │
   │  │                                                       │
   │  └───────────────────────────────────────────────────────┘
   │
   │  ┌──────────────────────────────────────────────────────┐
   │  │ UI (TypeScript, ui/, Next.js)                        │
   │  │   /ui/ HTTP → management → live config view          │
   │  └──────────────────────────────────────────────────────┘

Module Layering

Workspace Top Level

Module                          | Language   | Role
crates/agentgateway-app/        | Rust       | Thin binary shell (signals / logging / panic hook)
crates/agentgateway/            | Rust       | Main data-plane library; all three protocol gateways live here
crates/xds/                     | Rust       | xDS client + proto bindings
crates/hbone/                   | Rust       | Istio HBONE (mTLS over HTTP/2 CONNECT) transport
crates/celx/ + crates/cel-fork/ | Rust       | CEL policy evaluation (fork plus HTTP integration)
controller/                     | Go         | controller-runtime + ADS server + Helm deployer
api/                            | Go         | Go bindings for resource.proto
ui/                             | TypeScript | Next.js management UI

Data-Plane Layers (crates/agentgateway/src/)

Layer            | Modules                              | Responsibility
L1 Config        | control/, config.rs                  | Local YAML + xDS client + outbound client
L2 State         | state_manager.rs, store/             | Reactive Store aggregating binds / discovery / ads
L3 Proxy         | proxy/gateway.rs, proxy/httpproxy.rs | TCP accept → TLS → route → upstream
L4 HTTP          | http/                                | Filter chain / retry / timeout / transform
L5 Protocol      | llm/, mcp/, a2a/                     | The three AI protocol gateways
L6 Cross-cutting | cel/, telemetry/, transport/         | Policy evaluation / observability / HBONE tunnel
L7 Management    | management/, ui.rs                   | Management HTTP / health / UI static assets

Project-Owned CRDs

CRD                    | Short name | Purpose
AgentgatewayPolicy     | agpol      | CEL policy container, attached to a Gateway / Listener / Route
AgentgatewayBackend    | agbe       | Backend config: AI / MCP / A2A / Static
AgentgatewayParameters | -          | GatewayClass-level Pod template parameters

Upstream Gateway API resources reused directly: Gateway, HTTPRoute, GRPCRoute, TCPRoute, TLSRoute, GatewayClass, ListenerSet, InferencePool.

Key Data Flows

End-to-End Data Flow of One LLM Request

sequenceDiagram
  autonumber
  participant C as Client<br/>(curl / OpenAI SDK)
  participant GW as proxy/gateway.rs<br/>(TCP accept · TLS · HTTP framing)
  participant HP as proxy/httpproxy.rs<br/>(route · filters · CEL · LB)
  participant LLM as llm/mod.rs<br/>(AIProvider dispatch)
  participant CL as client/<br/>(reqwest + hyper + HBONE)
  participant U as Upstream LLM<br/>(api.openai.com / api.anthropic.com / …)
  C->>GW: POST /v1/chat/completions (model: "gpt-4")
  activate GW
  Note over GW: LazyConfigAcceptor → SNI<br/>look up binds → Listener TLS<br/>rustls termination → hyper H1/H2
  GW->>HP: handed off
  deactivate GW
  activate HP
  Note over HP: 1. select_route_chain<br/>2. get Route + Policy[] + Backend[]<br/>3. header/retry/timeout/transform<br/>4. CEL authorization allow|deny<br/>5. weighted_random_choice → Backend<br/>6. Backend.kind = AI{provider}
  HP->>LLM: dispatch by AIProvider enum
  deactivate HP
  activate LLM
  Note over LLM: provider.transform_request()<br/>openai · anthropic · gemini · bedrock · vertex · azure · copilot<br/>guardrails (pre): regex/moderation/model-armor/...<br/>token-bucket · cost limit
  LLM->>CL: ProviderRequest
  deactivate LLM
  activate CL
  Note over CL: connection pool (crates/pool)<br/>if backend is a mesh workload → HBONE tunnel<br/>(mTLS + H2 CONNECT)<br/>otherwise plain TLS / plaintext
  CL->>U: provider-native HTTP
  deactivate CL
  activate U
  U-->>CL: SSE stream / chunked
  deactivate U
  activate CL
  CL-->>LLM: stream chunks
  deactivate CL
  activate LLM
  Note over LLM: provider.transform_response()<br/>reverse schema → unified OpenAI format<br/>guardrails (post)<br/>telemetry: tokens_in/out · cost · latency<br/>access log + OTLP span
  LLM-->>C: OpenAI-compatible SSE
  deactivate LLM
Original ASCII diagram
Client (curl / OpenAI SDK)
  │  POST /v1/chat/completions   (model: "gpt-4")
  ▼
┌─────────────────────────────────────────────────────────────────┐
│ proxy/gateway.rs::run()                                         │
│    TCP accept (multi-thread tokio runtime)                      │
│       │                                                         │
│       ▼ LazyConfigAcceptor                                      │
│    Peek SNI → find matching Listener TLS config in binds        │
│       │                                                         │
│       ▼                                                         │
│    rustls TLS termination                                       │
│       │                                                         │
│       ▼                                                         │
│    hyper HTTP/1.1 or H2 framing                                 │
└──────┬──────────────────────────────────────────────────────────┘
       │
       ▼
┌─────────────────────────────────────────────────────────────────┐
│ proxy/httpproxy.rs::proxy()                                     │
│  1. select_route_chain()  ← match on path / header / method     │
│  2. get Route + Policy[] + Backend[]                            │
│  3. http/filters run: header edits / retry / timeout / transform│
│  4. CEL authorization (cel/ + policy IR referenced by agpol)    │
│     ─ read header/body/JWT claims → evaluate → allow|deny       │
│  5. weighted_random_choice picks a Backend                      │
│       │                                                         │
│       ▼                                                         │
│  6. detect Backend.kind = AI{provider: OpenAI|...}              │
└──────┬──────────────────────────────────────────────────────────┘
       │
       ▼
┌─────────────────────────────────────────────────────────────────┐
│ llm/mod.rs → dispatch by AIProvider enum                        │
│   provider.transform_request()                                  │
│     ├─ openai.rs   passthrough (already OpenAI format)          │
│     ├─ anthropic.rs OpenAI → Anthropic messages                 │
│     ├─ gemini.rs   OpenAI → Gemini generateContent              │
│     ├─ bedrock.rs  OpenAI → AWS SigV4 + Bedrock invoke          │
│     ├─ vertex.rs   gcp_auth + OpenAI → Vertex                   │
│     ├─ azure.rs    OpenAI → Azure OpenAI endpoint               │
│     └─ copilot.rs  GitHub Copilot                               │
│   guardrails (pre-flight)                                       │
│     regex / moderation / model-armor / content-safety / webhook │
│   token-bucket / cost limit                                     │
└──────┬──────────────────────────────────────────────────────────┘
       │
       ▼
┌─────────────────────────────────────────────────────────────────┐
│ client/  →  reqwest+hyper                                       │
│   ─ via connection pool (crates/pool)                           │
│   ─ via HBONE tunnel  ?  (if backend is a mesh workload:        │
│       transport/hbone, mTLS + H2 CONNECT tunnel)                │
│   ─ otherwise plain TLS / plaintext                             │
└──────┬──────────────────────────────────────────────────────────┘
       │
       ▼
   Upstream LLM provider (api.openai.com / api.anthropic.com / ...)
       │
       ▼ response (streaming SSE / chunked)
┌─────────────────────────────────────────────────────────────────┐
│ provider.transform_response()                                   │
│   ─ reverse schema conversion → unified OpenAI format           │
│   ─ guardrails (post-flight)                                    │
│   ─ telemetry: tokens_in / tokens_out / cost / latency          │
│   ─ access log + OTLP span                                      │
└──────┬──────────────────────────────────────────────────────────┘
       │
       ▼
   Client receives OpenAI-compatible SSE stream
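
The token-bucket step in the flow above can be illustrated with a minimal sketch. This is not agentgateway's implementation: the `TokenBucket` type, its fields, and the explicit `now` argument (seconds on a caller-supplied clock, used here to keep the example deterministic) are assumptions made for illustration.

```rust
/// Hypothetical token bucket; times are seconds on a caller-supplied clock.
#[derive(Debug)]
pub struct TokenBucket {
    capacity: f64,
    refill_per_sec: f64,
    tokens: f64,
    last_refill: f64,
}

impl TokenBucket {
    /// Start with a full bucket at time 0.
    pub fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, refill_per_sec, tokens: capacity, last_refill: 0.0 }
    }

    /// Refill for the elapsed time, then try to spend `cost` tokens.
    /// Returns true if the request may proceed.
    pub fn try_acquire(&mut self, now: f64, cost: f64) -> bool {
        let elapsed = (now - self.last_refill).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        // Never move the refill timestamp backwards if the clock is skewed.
        self.last_refill = self.last_refill.max(now);
        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            false
        }
    }
}
```

With a capacity of 10 and a refill rate of 1 token per second, a request costing 8 succeeds immediately, a following request costing 5 is rejected, and the same request succeeds 3 seconds later once the bucket has refilled to 5. For LLM traffic the cost would plausibly be a token count rather than a request count.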

Control Plane → Data Plane Config Distribution

flowchart TD
  K["kubectl apply<br/>Gateway / HTTPRoute / agpol / agbe.yaml"]
  Watch["controller-runtime watch<br/>→ KRT collection mutate"]
  Trans["Translator: K8s objs → resource.Resource oneof"]
  subgraph RES["resource.Resource (oneof 9 variants)"]
    direction TB
    R1["Bind: TCP listen addr"]
    R2["Listener: TLS / SNI / protocol"]
    R3["Route: HTTP route + filters"]
    R4["TCPRoute"]
    R5["Backend: AI / MCP / A2A / Service / Static"]
    R6["Policy: CEL IR + transform + retry + ratelimit"]
    R7["RouteGroup"]
    R8["Workload / Service<br/>Istio Address compatible"]
  end
  ADS["ADS Server (gRPC)<br/>controller/pkg/xds/server.go<br/>DeltaAggregatedDiscoveryService"]
  Client["agentgateway / control / AdsClient"]
  State["state_manager + Stores<br/>{ binds, discovery }"]
  Rebuild["proxy/gateway<br/>rebuilds Listener / Route indexes"]
  K --> Watch --> Trans --> RES --> ADS
  ADS -- "Delta xDS (incremental, single connection)" --> Client
  Client -- apply --> State
  State -- triggers --> Rebuild
Original ASCII diagram
kubectl apply Gateway/HTTPRoute/agpol/agbe.yaml
      │
      ▼
controller-runtime watch → KRT collection mutate
      │
      ▼
Translator: K8s objs → resource.Resource oneof
   ├─ Bind          (TCP listen addr)
   ├─ Listener      (TLS / SNI / protocol)
   ├─ Route         (HTTP route + filters)
   ├─ TCPRoute
   ├─ Backend       (AI / MCP / A2A / Service / Static)
   ├─ Policy        (CEL IR + transform + retry + ratelimit)
   ├─ RouteGroup
   └─ Workload/Service (Istio Address compatible)
      │
      ▼
ADS Server (gRPC, controller/pkg/xds/server.go)
   DeltaAggregatedDiscoveryService
      │
      ▼ Delta xDS (incremental)
agentgateway/control/AdsClient
      │
      ▼ apply
state_manager + Stores { binds, discovery }
      │
      ▼ triggers
proxy/gateway rebuilds Listener / Route indexes

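There are only two xDS resource types:
- istio.workload.Address: compatible with Istio ambient mesh workload/service discovery
- agentgateway.dev.resource.Resource: the main resource oneof (9 variants covering the entire data-plane state)

Only Delta ADS is used, multiplexed over a single connection, rather than one stream per resource type.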
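
To make the incremental model concrete, here is a minimal sketch of how a client-side store might apply one delta update. It is an illustration only: `DeltaUpdate`, `Store`, and the string-encoded resources are hypothetical stand-ins, not the types used by agentgateway's AdsClient or state_manager.

```rust
use std::collections::HashMap;

/// One incremental update (hypothetical shape): resources to upsert, names to delete.
pub struct DeltaUpdate {
    pub upserts: Vec<(String, String)>, // (resource name, serialized resource)
    pub removed: Vec<String>,
}

/// Minimal in-memory store keyed by resource name.
#[derive(Default)]
pub struct Store {
    resources: HashMap<String, String>,
}

impl Store {
    /// Apply one delta and return how many entries actually changed.
    pub fn apply(&mut self, delta: DeltaUpdate) -> usize {
        let mut changed = 0;
        for (name, body) in delta.upserts {
            let previous = self.resources.insert(name, body.clone());
            // Count only inserts that add a resource or change its content.
            if previous.as_deref() != Some(body.as_str()) {
                changed += 1;
            }
        }
        for name in delta.removed {
            // Removing an unknown name is a no-op.
            if self.resources.remove(&name).is_some() {
                changed += 1;
            }
        }
        changed
    }

    pub fn get(&self, name: &str) -> Option<&str> {
        self.resources.get(name).map(String::as_str)
    }

    pub fn len(&self) -> usize {
        self.resources.len()
    }
}
```

The point of the delta form is that an update touching one route carries only that route, so the receiver can limit index rebuilds to updates where `apply` reports a change.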

One MCP Tool Call

flowchart TD
  Client["MCP Client<br/>(Cursor · Claude Desktop · OpenAI Responses API)"]
  Router["mcp/router.rs<br/>routes to the matching upstream by session id"]
  Handler["mcp/handler.rs<br/>tool-list federation (merge_tools)<br/>merges tools/list from multiple upstreams<br/>+ CEL RBAC filtering (mcp/rbac.rs)"]
  subgraph UP["mcp/upstream/"]
    direction LR
    Stdio["stdio.rs"]
    SHTTP["streamablehttp.rs"]
    SSE["sse.rs"]
    OAPI["openapi.rs<br/>(automatic OpenAPI → MCP bridge)"]
  end
  Server["Upstream MCP server<br/>or OpenAPI endpoint"]
  Client -- "JSON-RPC over stdio · HTTP Streamable · SSE" --> Router
  Router --> Handler
  Handler -- "tools/call" --> UP
  UP --> Server
Original ASCII diagram
MCP Client (Cursor / Claude Desktop / OpenAI Responses API)
  │ JSON-RPC over stdio | HTTP Streamable | SSE
  ▼
mcp/router.rs  ── routing layer finds the matching upstream by session id
  │
  ▼
mcp/handler.rs  ── tool-list federation (merge_tools)
  │     │
  │     └─ merges tools/list from multiple upstream MCP servers
  │        + CEL RBAC filtering (mcp/rbac.rs)
  │
  ▼ tools/call
mcp/upstream/{stdio.rs | streamablehttp.rs | sse.rs | openapi.rs}
  │
  ▼
Upstream MCP server / OpenAPI endpoint

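Downstream transports: HTTP Streamable / SSE / stdio. Upstream transports: the same three, plus openapi.rs, which automatically bridges any OpenAPI spec into MCP tools (no changes needed on the upstream side). Federation merges the tool sets of multiple MCP servers via handler.rs::merge_tools(); tool names are namespaced with a __ prefix to avoid collisions, and CEL RBAC filters tool visibility by client identity.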
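
The federation idea can be sketched in a few lines. The sketch below is an assumption-laden illustration rather than the code in handler.rs: the reduced `Tool` type, the `<upstream>__<tool>` naming scheme, and the `allowed` callback (a stand-in for the CEL RBAC check) are all hypothetical.

```rust
/// A tool as advertised by one upstream MCP server (hypothetical, reduced shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Tool {
    pub name: String,
}

/// Merge per-upstream tool lists into one federated list.
/// Each tool is renamed to "<upstream>__<tool>" and kept only if
/// `allowed(upstream, tool)` returns true (stand-in for the RBAC filter).
pub fn merge_tools<F>(upstreams: &[(&str, Vec<Tool>)], allowed: F) -> Vec<Tool>
where
    F: Fn(&str, &str) -> bool,
{
    let mut merged = Vec::new();
    for (upstream, tools) in upstreams.iter() {
        let upstream: &str = upstream;
        for tool in tools {
            if allowed(upstream, tool.name.as_str()) {
                merged.push(Tool {
                    name: format!("{}__{}", upstream, tool.name),
                });
            }
        }
    }
    merged
}

/// Split a federated name back into (upstream, tool) so that a tools/call
/// can be routed to the server that owns the tool.
pub fn split_tool_name(federated: &str) -> Option<(&str, &str)> {
    federated.split_once("__")
}
```

Two upstreams that both expose a `search` tool then appear to the client as two distinct tools, and the prefix doubles as the routing key for the later tools/call.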

A2A (Agent-to-Agent)

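flowchart TD
  Client["Agent Client"]
  A2A["a2a/mod.rs<br/>intercepts the Agent Card discovery request<br/>passes it through to the upstream, rewrites URLs in the response<br/>upstream's real address → the gateway's own address"]
  Up["Upstream Agent"]
  Followup["all subsequent Agent calls<br/>go back through the gateway<br/>→ policy / observability apply naturally"]
  Client -- "HTTP GET /.well-known/agent-card.json" --> A2A
  A2A --> Up
  A2A --> Followup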
Original ASCII diagram
Agent Client
  │ HTTP GET /.well-known/agent-card.json
  ▼
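a2a/mod.rs  ── intercepts the Agent Card discovery request
  │  ─ passes it through to the upstream, but *rewrites the URLs in the response*,
  │    replacing the upstream's real address with the gateway's own address
  │  ─ all subsequent Agent calls go back through the gateway and naturally pick up policy/observability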

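A2A is the lightest of the three protocols: at its core it is just URL rewriting. Google's A2A protocol places the entry point at the Agent Card well-known endpoint, so once the gateway controls discovery, it controls every subsequent call.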
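
A minimal sketch of the classify-then-rewrite idea follows. It is deliberately simplified and is not the logic in a2a/mod.rs: both function names are hypothetical, and plain string replacement stands in for what would more robustly be done on the parsed JSON document.

```rust
/// True if the request path targets the Agent Card well-known endpoint.
pub fn is_agent_card_request(path: &str) -> bool {
    path.ends_with("/.well-known/agent-card.json")
}

/// Rewrite every occurrence of the upstream base URL in the card body so that
/// clients are pointed at the gateway instead of the upstream agent.
/// String replacement is a simplification for illustration.
pub fn rewrite_agent_card(card_json: &str, upstream_base: &str, gateway_base: &str) -> String {
    let from = upstream_base.trim_end_matches('/');
    let to = gateway_base.trim_end_matches('/');
    if from.is_empty() {
        // Nothing sensible to replace; return the card unchanged.
        return card_json.to_string();
    }
    card_json.replace(from, to)
}
```

Once the card advertises the gateway's address, a client that follows the card sends all later agent calls to the gateway without any client-side configuration.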

Design Decisions and Philosophy

Key Component Deep Dive: LLM Provider Adapters

crates/agentgateway/src/llm/ is the thickest protocol adaptation layer in the data plane. The unifying abstraction skeleton:

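// Simplified skeleton of the provider abstraction; request/response types are elided.
enum AIProvider {
    OpenAI, Anthropic, Gemini, Bedrock, Vertex, Azure, Copilot,
}

trait Provider {
    // Translate the unified OpenAI-style request into the provider's native request.
    async fn transform_request(&self, req: OpenAIRequest) -> ProviderRequest;
    // Translate the provider's native response back into the OpenAI-style response.
    async fn transform_response(&self, resp: ProviderResponse) -> OpenAIResponse;
    // Convert one streamed upstream chunk into an OpenAI-style chunk.
    async fn stream_chunk(&self, chunk: Bytes) -> OpenAIChunk;
}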

Externally it exposes a uniform OpenAI-compatible API (/v1/chat/completions, /v1/embeddings); internally it dispatches to the provider, which translates in both directions.
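
As a rough illustration of the request-side translation, the sketch below dispatches on a two-variant enum and converts an OpenAI-style request into an Anthropic-style one. All types here are reduced, synchronous, hypothetical stand-ins for the real ones; the only provider fact relied on is that the Anthropic Messages API takes the system prompt as a top-level field rather than as a message with role "system".

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Reduced OpenAI-style chat request (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct OpenAIRequest {
    pub model: String,
    pub messages: Vec<Message>,
}

/// Provider-native request after translation (hypothetical shape).
#[derive(Debug, PartialEq)]
pub enum ProviderRequest {
    OpenAI(OpenAIRequest),
    Anthropic { model: String, system: Option<String>, messages: Vec<Message> },
}

#[derive(Debug, Clone, Copy)]
pub enum AIProvider {
    OpenAI,
    Anthropic,
}

impl AIProvider {
    /// Dispatch on the enum and translate the unified request.
    pub fn transform_request(self, req: OpenAIRequest) -> ProviderRequest {
        match self {
            // Already in OpenAI format: pass through unchanged.
            AIProvider::OpenAI => ProviderRequest::OpenAI(req),
            AIProvider::Anthropic => {
                // Lift "system" messages out into the top-level system field.
                let (system, messages): (Vec<Message>, Vec<Message>) =
                    req.messages.into_iter().partition(|m| m.role == "system");
                let system = if system.is_empty() {
                    None
                } else {
                    let parts: Vec<String> = system.into_iter().map(|m| m.content).collect();
                    Some(parts.join("\n"))
                };
                ProviderRequest::Anthropic { model: req.model, system, messages }
            }
        }
    }
}
```

Enum dispatch keeps the set of providers closed and visible in one `match`, which suits a gateway where each provider needs hand-written translation anyway.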

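Guardrails: content moderation specific to LLM traffic, applied in both directions (pre-flight and post-flight). Supported checks are regex, OpenAI Moderation, AWS Bedrock Guardrails, Google Model Armor, Azure Content Safety, and custom webhooks; each backend chains them in the order they are declared.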
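
The "chain in declared order" behaviour can be sketched as below. The `Guardrail` trait, `Verdict` type, and the two local checks are hypothetical; `DenyList` is a dependency-free stand-in for a regex rule, and the remote checks named above (moderation APIs, webhooks) are not modeled.

```rust
/// Outcome of one guardrail check.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Allow,
    Reject(String),
}

/// Hypothetical guardrail interface; real checks may be remote calls.
pub trait Guardrail {
    fn name(&self) -> &str;
    fn check(&self, text: &str) -> Verdict;
}

/// Stand-in for a regex rule: rejects text containing any listed (lowercase) word.
pub struct DenyList {
    pub words: Vec<String>,
}

impl Guardrail for DenyList {
    fn name(&self) -> &str {
        "deny-list"
    }
    fn check(&self, text: &str) -> Verdict {
        let lower = text.to_lowercase();
        match self.words.iter().find(|w| lower.contains(w.as_str())) {
            Some(word) => Verdict::Reject(format!("matched '{}'", word)),
            None => Verdict::Allow,
        }
    }
}

/// Rejects text longer than the given number of bytes.
pub struct MaxLen(pub usize);

impl Guardrail for MaxLen {
    fn name(&self) -> &str {
        "max-len"
    }
    fn check(&self, text: &str) -> Verdict {
        if text.len() > self.0 {
            Verdict::Reject(format!("{} bytes exceeds limit {}", text.len(), self.0))
        } else {
            Verdict::Allow
        }
    }
}

/// Run the chain in declared order; the first rejection short-circuits.
pub fn run_chain(chain: &[Box<dyn Guardrail>], text: &str) -> Verdict {
    for guard in chain {
        if let Verdict::Reject(reason) = guard.check(text) {
            return Verdict::Reject(format!("{}: {}", guard.name(), reason));
        }
    }
    Verdict::Allow
}
```

Short-circuiting on the first rejection means cheap local checks placed early in the chain can spare a call to a slower remote check.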

Load balancing: across replicas it uses two-random-choice (steadier tail latency than round-robin); across model aliases it uses weighted random.
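
Both strategies are small enough to sketch. The code below is an illustration under stated assumptions, not agentgateway's load balancer: it assumes in-flight request count as the load signal for two-random-choice (the source does not say which signal is used), takes the random source as a closure so the logic is testable, and includes a tiny xorshift generator only as a dependency-free stand-in for a real RNG.

```rust
/// Two-random-choice: sample two replicas and keep the less loaded one.
/// `in_flight[i]` is the assumed load signal for replica i.
pub fn two_random_choice(in_flight: &[u32], mut rand: impl FnMut() -> u64) -> Option<usize> {
    if in_flight.is_empty() {
        return None;
    }
    let n = in_flight.len() as u64;
    let a = (rand() % n) as usize;
    let b = (rand() % n) as usize;
    Some(if in_flight[a] <= in_flight[b] { a } else { b })
}

/// Weighted random: pick index i with probability weights[i] / sum(weights).
pub fn weighted_random(weights: &[u32], mut rand: impl FnMut() -> u64) -> Option<usize> {
    let total: u64 = weights.iter().map(|&w| u64::from(w)).sum();
    if total == 0 {
        return None;
    }
    // Walk the cumulative weights until the sampled point falls in a bucket.
    let mut point = rand() % total;
    for (i, &w) in weights.iter().enumerate() {
        let w = u64::from(w);
        if point < w {
            return Some(i);
        }
        point -= w;
    }
    None // not reached when total > 0
}

/// Minimal xorshift64 generator (seed must be non-zero); illustration only.
pub struct XorShift64(pub u64);

impl XorShift64 {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}
```

Two-random-choice avoids both the herd effect of always picking the globally least-loaded replica and the load-blindness of round-robin, which is the usual argument for its better tail latency.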

Relationship to HiClaw / agent-sandbox

flowchart LR
  subgraph Platform["AI Platform Day 0"]
    direction LR
    Sandbox["Sandbox (gVisor pod)<br/>Agent code runs here<br/>NetworkPolicy denies direct access to cloud metadata"]
    AGW["agentgateway<br/>route LLM / MCP / A2A<br/>· CEL policy<br/>· Guardrails<br/>· Credential inject"]
    Sandbox --> AGW
  end
  Upstream["api.openai.com<br/>api.anthropic.com<br/>…"]
  AGW --> Upstream
  classDef sb fill:#14532d,stroke:#166534,color:#fff
  classDef gw fill:#7c2d12,stroke:#9a3412,color:#fff
  class Sandbox sb
  class AGW gw
Original ASCII diagram
┌──────────────────────────────────────────────────────┐
│  AI Platform Day 0                                   │
│                                                      │
│  ┌─────────────────────┐    ┌──────────────────────┐ │
│  │ Sandbox             │    │ agentgateway         │ │
│  │ (gVisor pod)        │───▶│ (route LLM/MCP/A2A)  │ │
│  │ Agent code runs here│    │  ─ CEL policy        │ │
│  │ NetworkPolicy denies│    │  ─ Guardrails        │ │
│  │ direct access to    │    │  ─ Credential inject │ │
│  │ cloud metadata      │    │                      │ │
│  └─────────────────────┘    └──────────┬───────────┘ │
│                                        │             │
└────────────────────────────────────────┼─────────────┘
                                         ▼
                            api.openai.com / api.anthropic.com / ...
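
The credential-injection role in the diagram can be sketched as follows. This is an assumed illustration, not agentgateway's code: headers are modeled as a plain list of name/value pairs, and `inject_credential` and the Bearer scheme are hypothetical choices.

```rust
/// Replace any client-supplied Authorization header with the gateway-held key,
/// so that code running in the sandbox never needs to hold the real credential.
pub fn inject_credential(headers: &mut Vec<(String, String)>, api_key: &str) {
    // Drop whatever the sandboxed client sent (header names are case-insensitive).
    headers.retain(|(name, _)| !name.eq_ignore_ascii_case("authorization"));
    headers.push(("authorization".to_string(), format!("Bearer {}", api_key)));
}
```

Combined with a NetworkPolicy that leaves the gateway as the only egress path, the provider key stays outside the sandbox entirely.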

Related Pages