Comparing AI Gateways

AI gateways sit between your application and the AI providers — handling authentication, routing, caching, fallbacks, and observability. The right gateway depends on your infrastructure, team size, and priorities.

This guide compares the major options: Vercel AI Gateway, Cloudflare AI Gateway, OpenRouter, AWS Bedrock, Portkey, LiteLLM, and Helicone.

What Is an AI Gateway?

An AI gateway is a reverse proxy for AI model APIs. Instead of calling OpenAI, Anthropic, or Mistral directly, you call the gateway — which forwards the request, handles errors, caches responses, logs everything, and routes traffic based on your configuration.

Key benefits:

Single integration point: One SDK, one API key, many providers
Reliability: Automatic fallback when a provider is down or slow
Cost control: Semantic caching reduces duplicate API calls
Observability: Centralized logs, latency tracking, and spend analytics
Security: Your app never holds raw provider API keys
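
To make the reliability benefit concrete, here is a minimal sketch of ordered fallback: try each provider in turn and return the first successful response. The `Provider` type and `callWithFallback` function are hypothetical names chosen for illustration and are not part of any gateway's API. Real gateways typically add timeouts, retries, and health checks on top of this.

```typescript
// Hypothetical provider abstraction: a name plus a function that returns a completion.
type Provider = {
  name: string;
  complete: (prompt: string) => Promise<string>;
};

type FallbackResult = { provider: string; text: string };

// Try each provider in the given order and return the first success.
// If every provider fails, throw one error that lists each failure.
async function callWithFallback(
  providers: Provider[],
  prompt: string,
): Promise<FallbackResult> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      const text = await provider.complete(prompt);
      return { provider: provider.name, text };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.name}: ${message}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join('; ')}`);
}
```

The order of the `providers` array acts as the routing policy in this sketch: put the preferred provider first and the backups after it.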

Feature Comparison

Feature	Vercel AI Gateway	Cloudflare AI Gateway	OpenRouter	AWS Bedrock	Portkey	LiteLLM
Hosted service	✅	✅	✅	✅	✅	Self-hosted
Self-hosted	❌	❌	❌	❌	✅	✅
Open source	❌	❌	❌	❌	✅ (gateway)	✅
Semantic caching	❌	❌	❌	❌	✅	✅
Exact-match caching	✅	✅	❌	❌	✅	✅
Fallbacks / routing	❌	❌	✅	❌	✅	✅
Load balancing	❌	❌	✅	❌	✅	✅
Observability dashboard	✅	✅	✅	CloudWatch	✅	✅
Spend analytics	✅	✅	✅	✅	✅	✅
Fine-tuning	❌	❌	❌	✅	❌	❌
RAG / Knowledge bases	❌	❌	❌	✅	❌	❌
IAM / SSO access control	Vercel teams	Cloudflare	❌	AWS IAM	✅	❌
VPC / private networking	❌	❌	❌	✅	❌	Self-hosted
Guardrails / content filters	❌	❌	❌	✅	✅	✅
Free tier	✅	✅	✅	❌	✅	✅
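
The table separates exact-match caching from semantic caching. As a rough sketch of the exact-match variant, the cache below keys responses on the literal request contents, so a reworded prompt misses the cache even when it means the same thing (the gap semantic caching aims to close). The names `ChatRequest`, `cacheKey`, and `ExactMatchCache` are assumptions for illustration, not any gateway's implementation.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };
type ChatRequest = { model: string; messages: ChatMessage[]; temperature?: number };

// Build a deterministic key. Fields are serialized in a fixed order so the
// property order of the caller's object does not change the key.
function cacheKey(req: ChatRequest): string {
  return JSON.stringify([
    req.model,
    req.temperature ?? null,
    req.messages.map((m) => [m.role, m.content]),
  ]);
}

class ExactMatchCache {
  private store = new Map<string, string>();
  hits = 0;
  misses = 0;

  // Return the cached response for an identical request, or call
  // `fetchFresh` (the real provider call) and remember its result.
  async getOrFetch(
    req: ChatRequest,
    fetchFresh: () => Promise<string>,
  ): Promise<string> {
    const key = cacheKey(req);
    const cached = this.store.get(key);
    if (cached !== undefined) {
      this.hits += 1;
      return cached;
    }
    this.misses += 1;
    const fresh = await fetchFresh();
    this.store.set(key, fresh);
    return fresh;
  }
}
```

A production cache would also need expiry and a size limit; both are omitted here to keep the sketch short.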

Supported Providers

Gateway	Providers
Vercel AI Gateway	OpenAI, Anthropic, xAI, Google, Perplexity, Fireworks, Together AI, Groq, Azure OpenAI
Cloudflare AI Gateway	OpenAI, Anthropic, Mistral, Cohere, Google, Perplexity, Groq, AWS Bedrock, Azure OpenAI, and more
OpenRouter	200+ models from OpenAI, Anthropic, Meta, Mistral, Google, and open-weight models
AWS Bedrock	Anthropic, Meta, Mistral, Cohere, AI21 Labs, Amazon Titan, Stability AI
Portkey	200+ providers including all major cloud AI services and self-hosted endpoints
LiteLLM	100+ providers with OpenAI-compatible pass-through

Pricing Comparison

Gateway	Base price	Caching savings	Volume discounts
Vercel AI Gateway	Included in Vercel plan	Yes (exact match)	With Vercel plan
Cloudflare AI Gateway	Free (within Workers limits)	Yes (exact match)	Cloudflare plans
OpenRouter	Free gateway, pay model prices	No	No (pass-through)
AWS Bedrock	Pay-per-token (no gateway fee)	No	Provisioned throughput
Portkey	Free tier; $49–$199/mo paid	Yes (semantic + exact)	Enterprise plans
LiteLLM	Free (self-hosted)	Yes	N/A (you host it)

OpenRouter passes provider costs to you at or slightly above list price. Vercel and Cloudflare gateways are add-ons to their existing platforms. Portkey's paid tiers unlock guardrails and longer log retention.

Gateway Profiles

Vercel AI Gateway

Best for teams already on Vercel. Tight integration with the Vercel AI SDK — pass gateway() in your config and observability is automatic. No separate account or API key. Caching is exact-match only. Does not support fallbacks or load balancing across providers.

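// Provider package for Vercel AI Gateway, used together with the AI SDK's `ai` package.
import { gateway } from '@ai-sdk/gateway';
import { generateText } from 'ai';

// Model IDs take the form 'provider/model'.
const { text } = await generateText({
  model: gateway('openai/gpt-4o'),
  prompt: 'Hello',
});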

Strengths: Zero-setup for Vercel users, integrated spend dashboard
Limitations: No fallbacks, no semantic caching, Vercel-only

Cloudflare AI Gateway

Best for teams building on Cloudflare Workers or Pages. The gateway runs at Cloudflare's edge — low latency, global distribution, and no additional infrastructure. Supports a wide range of providers with exact-match caching and rate limiting.

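// Replace {account-id} and {gateway-id} with the values for your own gateway.
const response = await fetch(
  'https://gateway.ai.cloudflare.com/v1/{account-id}/{gateway-id}/openai/chat/completions',
  {
    method: 'POST',
    headers: {
      // Read the provider key from an environment variable
      // (inside a Worker, use your env binding instead).
      'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: 'Hello' }],
    }),
  }
);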

Strengths: Edge-native, free tier, easy to set up
Limitations: No fallbacks, no semantic caching, no routing logic

OpenRouter

OpenRouter is a marketplace-style gateway focused on model access and cost. It provides a unified OpenAI-compatible API across 200+ models — including many open-weight models not available elsewhere — with automatic routing to the cheapest or fastest available provider for each model.

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://openrouter.ai/api/v1',
  apiKey: process.env.OPENROUTER_API_KEY,
});

const response = await client.chat.completions.create({
  model: 'anthropic/claude-3.5-sonnet',
  messages: [{ role: 'user', content: 'Hello' }],
});

Strengths: Widest model selection, competitive pricing, fallback routing built in
Limitations: No caching, limited observability, less enterprise-focused

AWS Bedrock

Bedrock is the best choice when your team is already on AWS and needs AI within the AWS security boundary. It's not a "gateway" in the traditional sense — it's a managed AWS service — but it plays the same role: multi-provider model access through a single API.

IAM integration is the defining feature. You control model access with the same IAM roles and policies you use for every other AWS service. Inference traffic can stay inside your VPC.

Strengths: IAM access control, VPC isolation, fine-tuning, knowledge bases
Limitations: AWS-only, no semantic caching, no cross-cloud routing

→ See AWS Bedrock as an AI Gateway

Portkey

Portkey is the most feature-complete managed gateway. It offers semantic caching (not just exact-match), automatic fallbacks, load balancing, granular observability, virtual key management, prompt versioning, and guardrails — all in one product.

It's also the only major hosted gateway with a fully open-source core (self-hostable).

Strengths: Semantic caching, fallbacks, virtual keys, self-hosting option
Limitations: Paid plans required for guardrails and longer log retention

→ See Portkey AI Gateway

LiteLLM

LiteLLM is a self-hosted Python proxy that exposes an OpenAI-compatible API across 100+ providers. It's the standard choice for ML engineering teams that want full control and are comfortable running infrastructure.

# Quote the extras so shells such as zsh do not treat the brackets as a glob pattern.
pip install 'litellm[proxy]'
litellm --model anthropic/claude-3-5-sonnet-20241022 --port 8000

Strengths: Full open source, no vendor lock-in, excellent Python ecosystem
Limitations: Requires infrastructure management, no hosted observability dashboard

Helicone

Helicone focuses on observability rather than routing. It's a logging proxy — you route requests through Helicone and it captures logs, costs, latency, and user sessions. It doesn't do caching, fallbacks, or multi-provider routing natively.

Strengths: Best-in-class logging UX, session tracking, prompt versioning
Limitations: Observability only, not a full gateway
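
The logging-proxy idea can be sketched in a few lines: wrap each model call, time it, and record the outcome. This is only an illustration of the pattern under simple assumptions; `LogEntry` and `withLogging` are hypothetical names and do not reflect Helicone's actual API or log schema.

```typescript
type LogEntry = {
  model: string;
  ok: boolean;
  latencyMs: number;
  error?: string;
};

// Run `call`, append one entry to `log` describing the result, and pass the
// result (or the error) through unchanged. `now` is injectable for testing.
async function withLogging<T>(
  model: string,
  call: () => Promise<T>,
  log: LogEntry[],
  now: () => number = Date.now,
): Promise<T> {
  const start = now();
  try {
    const value = await call();
    log.push({ model, ok: true, latencyMs: now() - start });
    return value;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    log.push({ model, ok: false, latencyMs: now() - start, error: message });
    throw err;
  }
}
```

Because the wrapper rethrows errors and returns values untouched, it can sit in front of an existing call without changing the caller's behavior.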

Decision Guide

For startups

Start with Vercel AI Gateway or Cloudflare AI Gateway if you're on those platforms — zero setup cost, good enough for most early-stage use cases.

Use OpenRouter if you want to experiment with many models cheaply without setting up accounts at each provider.

Add Portkey once you need production reliability (fallbacks + caching) or when observability becomes critical.

For cost-sensitive workloads

Portkey with semantic caching typically delivers the highest cost savings on workloads with repetitive prompts (support bots, FAQ systems, summarization pipelines). Cache hit rates of 20–40% are common on such workloads, though the actual rate depends on how often prompts repeat.
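
To see what a given hit rate means in dollars, here is a back-of-the-envelope sketch. It assumes every request costs the same and that a cache hit costs nothing, which will not hold exactly in practice. The function name and all figures in the example are hypothetical.

```typescript
type CostInputs = {
  monthlyRequests: number;
  usdPerRequest: number; // average provider cost of one uncached request (assumed uniform)
  cacheHitRate: number; // fraction between 0 and 1
};

type CostEstimate = { withoutCache: number; withCache: number; saved: number };

function estimateMonthlyCost(inputs: CostInputs): CostEstimate {
  const { monthlyRequests, usdPerRequest, cacheHitRate } = inputs;
  if (cacheHitRate < 0 || cacheHitRate > 1) {
    throw new RangeError('cacheHitRate must be between 0 and 1');
  }
  const withoutCache = monthlyRequests * usdPerRequest;
  // Only cache misses reach the provider; hits are assumed to be free.
  const withCache = withoutCache * (1 - cacheHitRate);
  return { withoutCache, withCache, saved: withoutCache - withCache };
}

// Hypothetical workload: 500,000 requests at $0.002 each with a 30% hit rate.
const estimate = estimateMonthlyCost({
  monthlyRequests: 500000,
  usdPerRequest: 0.002,
  cacheHitRate: 0.3,
});
console.log(estimate);
```

Under these assumptions the workload costs roughly $1,000 per month without caching and roughly $700 with it. Savings scale linearly with the hit rate, so measuring the real hit rate is the first step before comparing gateways on price.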

OpenRouter can reduce costs by routing to the cheapest provider for each request automatically.
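
Cheapest-provider routing can be pictured as a filter followed by a minimum. The sketch below is an illustration of the idea only, not OpenRouter's actual routing algorithm; the `ProviderOffer` type, provider names, and prices are all made up.

```typescript
type ProviderOffer = {
  provider: string;
  usdPerMillionTokens: number;
  available: boolean;
};

// Return the cheapest offer that is currently available,
// or undefined when no provider can serve the request.
function pickCheapest(offers: ProviderOffer[]): ProviderOffer | undefined {
  return offers
    .filter((offer) => offer.available)
    .reduce<ProviderOffer | undefined>(
      (best, offer) =>
        best === undefined || offer.usdPerMillionTokens < best.usdPerMillionTokens
          ? offer
          : best,
      undefined,
    );
}

// Hypothetical offers for the same model from three providers.
const offers: ProviderOffer[] = [
  { provider: 'provider-a', usdPerMillionTokens: 0.9, available: true },
  { provider: 'provider-b', usdPerMillionTokens: 0.6, available: false },
  { provider: 'provider-c', usdPerMillionTokens: 0.7, available: true },
];
console.log(pickCheapest(offers)?.provider); // provider-c
```

Here `provider-b` has the lowest price but is unavailable, so the request goes to `provider-c`. A real router would likely weigh latency and error rates alongside price.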

For enterprise and compliance

AWS Bedrock if you're on AWS and need VPC isolation, IAM access control, or data residency guarantees.

Portkey (self-hosted) if you need a full gateway stack but cannot send data outside your own infrastructure.

Cloudflare AI Gateway if you're on Cloudflare Enterprise and need edge-native performance with SOC 2 compliance.

For ML/platform teams

LiteLLM is the most flexible option for teams that run their own infrastructure and want full control. It integrates with LangChain, LlamaIndex, and most Python ML tooling.

Summary

Use Case	Recommended Gateway
Already on Vercel	Vercel AI Gateway
Already on Cloudflare Workers	Cloudflare AI Gateway
Maximum model variety, low setup	OpenRouter
AWS + compliance requirements	AWS Bedrock
Full-featured managed gateway	Portkey
Self-hosted, Python-first	LiteLLM
Observability-first (single provider)	Helicone
Maximum control, no vendor lock-in	LiteLLM or Portkey (self-hosted)
