# Portkey AI Gateway
Portkey is an AI gateway that provides a unified API across 200+ language models with built-in fallbacks, load balancing, semantic caching, and an observability dashboard. It's available as a managed cloud service or as a self-hosted open-source deployment.
Portkey sits between your application and the AI providers. Your app calls Portkey once; Portkey handles provider routing, retries, caching, and logging. You get a single integration point regardless of how many AI providers you use.
## Overview
| Property | Details |
|---|---|
| Type | AI gateway (cloud or self-hosted) |
| Open source | Yes (gateway core on GitHub) |
| Supported providers | 200+ (OpenAI, Anthropic, Mistral, Cohere, Google, AWS, Azure, and more) |
| Pricing | Free tier available; paid plans for higher volume |
| Self-hosted | Yes (Docker, Kubernetes) |
| Best for | Teams wanting observability + reliability across multiple AI providers |
## Key Features

### Unified API
Portkey exposes a single API that wraps all providers with an OpenAI-compatible interface. Change your baseURL and apiKey, and your existing OpenAI SDK code works with any provider:
```typescript
import Portkey from 'portkey-ai';

const portkey = new Portkey({
  apiKey: 'PORTKEY_API_KEY',
  virtualKey: 'ANTHROPIC_VIRTUAL_KEY', // your encrypted Anthropic key
});

// Same OpenAI-style API, now routing through Anthropic
const completion = await portkey.chat.completions.create({
  messages: [{ role: 'user', content: 'What is the capital of France?' }],
  model: 'claude-3-5-sonnet-20241022',
});
```
You can also use the OpenAI SDK directly by overriding baseURL:
```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'PORTKEY_API_KEY',
  baseURL: 'https://api.portkey.ai/v1',
  defaultHeaders: {
    'x-portkey-virtual-key': 'ANTHROPIC_VIRTUAL_KEY',
  },
});
```
### Virtual Keys
Virtual keys are encrypted references to your actual provider API keys. Store your real keys in Portkey's vault; reference them by a virtual key alias in your code.
Benefits:
- Key rotation without code changes: Update the real key in Portkey; your code keeps using the same virtual key
- Scoped access: Share virtual keys with team members without exposing raw credentials
- Automatic redaction: Real keys never appear in logs or request traces
- Spend limits: Set per-virtual-key spend caps to prevent runaway costs
Create virtual keys in the Portkey dashboard or via API, then reference them by alias when constructing the client:

```typescript
const portkey = new Portkey({
  apiKey: 'PORTKEY_API_KEY',
  virtualKey: 'openai-prod-key', // your alias, not the real key
});
```
### Fallbacks
Configure automatic failover when a provider returns an error or is unavailable. If the primary model fails, Portkey retries with the next provider in the list — transparently, without changes to your application code.
```typescript
import { createConfig } from 'portkey-ai';

const config = createConfig({
  strategy: { mode: 'fallback' },
  targets: [
    { virtualKey: 'openai-key', overrideParams: { model: 'gpt-4o' } },
    { virtualKey: 'anthropic-key', overrideParams: { model: 'claude-3-5-sonnet-20241022' } },
    { virtualKey: 'mistral-key', overrideParams: { model: 'mistral-large-latest' } },
  ],
});

const portkey = new Portkey({ apiKey: 'PORTKEY_API_KEY', config });
```
Fallback triggers: HTTP errors (5xx), timeout, rate limit (429), or custom status codes you specify.
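Custom triggers and per-target retries live in the same config object. A hedged sketch — the `retry` and `onStatusCodes` field names follow Portkey's config schema, but verify them against the current docs, and the specific status codes and attempt count below are illustrative:

```typescript
const config = createConfig({
  strategy: {
    mode: 'fallback',
    onStatusCodes: [429, 500, 502, 503], // illustrative: fall back on these codes
  },
  retry: { attempts: 3 }, // retry the current target before falling back
  targets: [
    { virtualKey: 'openai-key', overrideParams: { model: 'gpt-4o' } },
    { virtualKey: 'anthropic-key', overrideParams: { model: 'claude-3-5-sonnet-20241022' } },
  ],
});
```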
### Load Balancing
Distribute traffic across multiple API keys or providers with configurable weights. Useful for staying under rate limits or managing cost by routing cheaper requests to lower-cost models.
```typescript
const config = createConfig({
  strategy: { mode: 'loadbalance' },
  targets: [
    { virtualKey: 'openai-key-1', weight: 0.5 },
    { virtualKey: 'openai-key-2', weight: 0.3 },
    { virtualKey: 'anthropic-key', weight: 0.2 },
  ],
});
```
Weight values are proportional — the example above sends 50% of traffic to openai-key-1, 30% to openai-key-2, and 20% to Anthropic.
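Since the weights are proportional rather than absolute, normalization happens implicitly. The idea can be sketched as a cumulative-distribution walk (this is an illustration of weighted selection in general, not Portkey's internal implementation):

```typescript
interface Target {
  virtualKey: string;
  weight: number;
}

// Pick a target given a random draw r in [0, 1): normalize the weights,
// then walk the cumulative distribution until r is covered.
function pickTarget(r: number, targets: Target[]): string {
  const total = targets.reduce((sum, t) => sum + t.weight, 0);
  let cumulative = 0;
  for (const t of targets) {
    cumulative += t.weight / total; // this target's normalized share of traffic
    if (r < cumulative) return t.virtualKey;
  }
  return targets[targets.length - 1].virtualKey; // guard against float rounding
}

const targets: Target[] = [
  { virtualKey: 'openai-key-1', weight: 0.5 },
  { virtualKey: 'openai-key-2', weight: 0.3 },
  { virtualKey: 'anthropic-key', weight: 0.2 },
];

console.log(pickTarget(0.4, targets));  // openai-key-1 (first 50% band)
console.log(pickTarget(0.95, targets)); // anthropic-key (last 20% band)
```

Because only the ratios matter, weights of `5 / 3 / 2` would route identically to `0.5 / 0.3 / 0.2`.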
### Semantic Caching
Portkey can cache responses based on semantic similarity, not just exact prompt matches. If two prompts ask essentially the same question, the second gets the cached response — reducing latency and cost.
Enable caching in your config:
```typescript
const config = createConfig({
  cache: {
    mode: 'semantic',         // 'simple' for exact-match only
    maxAge: 3600,             // cache TTL in seconds
    similarityThreshold: 0.9, // 0-1, higher = stricter matching
  },
  targets: [{ virtualKey: 'openai-key' }],
});
```
Cache hit rates depend on your workload. High-volume applications with repetitive queries (support chatbots, FAQ systems) see the most benefit — often 20–40% cost reduction.
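To make the similarity threshold concrete, here is a toy sketch of the underlying idea: embed each prompt as a vector, compare with cosine similarity, and treat anything at or above the threshold as a cache hit. Portkey's actual embedding model and matching logic are internal; the vectors below are made-up values for illustration:

```typescript
// Cosine similarity between two equal-length vectors: dot product over
// the product of magnitudes, giving a score in [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const threshold = 0.9; // plays the role of similarityThreshold

// Toy embeddings: a prompt, a close paraphrase, and an unrelated question.
const prompt = [0.8, 0.1, 0.1];
const paraphrase = [0.75, 0.15, 0.1];
const unrelated = [0.1, 0.2, 0.9];

console.log(cosineSimilarity(prompt, paraphrase) >= threshold); // true  → cache hit
console.log(cosineSimilarity(prompt, unrelated) >= threshold);  // false → cache miss
```

Raising the threshold toward 1.0 trades hit rate for safety: fewer paraphrases match, but the risk of serving a cached answer to a subtly different question drops.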
### Observability Dashboard
Every request routed through Portkey is logged with full metadata: model, tokens, latency, cost, virtual key, status, and the full request/response body (if enabled).
The dashboard provides:
- Request logs: Searchable history with request/response inspection
- Cost tracking: Per-provider, per-virtual-key, and per-metadata-tag spend
- Latency analytics: p50/p90/p99 by model and provider
- Error rate monitoring: Track failures, fallback triggers, and cache hits
- Token usage: Input vs. output token breakdown over time
You can also attach custom metadata to requests for filtering:
```typescript
const portkey = new Portkey({
  apiKey: 'PORTKEY_API_KEY',
  virtualKey: 'openai-key',
  metadata: {
    environment: 'production',
    userId: 'user-123',
    feature: 'summarization',
  },
});
```
Filter logs by any metadata field in the dashboard.
### Guardrails

Portkey's guardrails validate and transform inputs and outputs before and after model calls. Available checks include:
- Content moderation: Block unsafe content using built-in classifiers or provider moderation APIs
- PII detection: Detect and redact personal information
- Regex filters: Match or block patterns in inputs and outputs
- JSON schema validation: Ensure structured outputs conform to a schema
- Prompt injection detection: Block attempts to override system prompts
Configure guardrails in the dashboard and reference them in your config — no SDK changes needed.
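If you prefer config-as-code, dashboard-created guardrails can also be referenced from a gateway config. A hedged sketch — the `inputGuardrails`/`outputGuardrails` field names and the guardrail IDs below are assumptions, so confirm both against the current Portkey config schema:

```typescript
const config = createConfig({
  inputGuardrails: ['pii-redaction-check'], // hypothetical guardrail ID from the dashboard
  outputGuardrails: ['json-schema-check'],  // hypothetical guardrail ID from the dashboard
  targets: [{ virtualKey: 'openai-key' }],
});
```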
### Prompt Management
Portkey includes a prompt library where you can version, test, and deploy prompts independently of your application code. Teams can iterate on prompts in the dashboard without a code deploy.
```typescript
// Reference a managed prompt by ID
const response = await portkey.prompts.completions.create({
  promptID: 'pp-my-summarizer-v2',
  variables: { text: 'The document content here...' },
});
```
Prompts support variables, versioning, and A/B testing — useful when prompt quality is critical to your product.
## Supported Providers
Portkey supports 200+ providers through a combination of native integrations and OpenAI-compatible pass-through. A sample:
| Category | Providers |
|---|---|
| Frontier models | OpenAI, Anthropic, Google Gemini, Mistral |
| Cloud AI | AWS Bedrock, Azure OpenAI, Google Vertex AI |
| Open models | Together AI, Fireworks, Groq, Perplexity |
| Embeddings | OpenAI, Cohere, Voyage AI |
| Image generation | OpenAI DALL·E, Stability AI |
| Self-hosted | Ollama, vLLM, custom OpenAI-compatible endpoints |
## Self-Hosting
The Portkey gateway is open source and can be self-hosted. This is useful for air-gapped environments, compliance requirements, or teams that want full control over request data.
```bash
# Run with Docker
docker run -d \
  -p 8787:8787 \
  -e PORTKEY_API_KEY=your-api-key \
  portkeyai/gateway:latest
```

Or with Docker Compose:

```yaml
version: '3.8'
services:
  portkey-gateway:
    image: portkeyai/gateway:latest
    ports:
      - "8787:8787"
    environment:
      - NODE_ENV=production
    restart: unless-stopped
```
Then point your SDK at the self-hosted instance:
```typescript
const portkey = new Portkey({
  apiKey: 'PORTKEY_API_KEY',
  baseURL: 'http://localhost:8787/v1',
  virtualKey: 'openai-key',
});
```
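The gateway can also be called over plain HTTP without the SDK. A hedged example — the `x-portkey-provider` header follows Portkey's `x-portkey-*` header convention, but verify the exact names against the gateway docs for your version:

```bash
curl http://localhost:8787/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-portkey-provider: openai" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
```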
The self-hosted gateway handles routing, fallbacks, and load balancing. The observability dashboard remains a Portkey-hosted SaaS feature unless you bring your own logging infrastructure.
## Pricing
| Plan | Price | Requests/month | Features |
|---|---|---|---|
| Free | $0 | 10,000 | Logs (7-day retention), virtual keys, basic cache |
| Developer | $49/mo | 100,000 | Logs (30-day retention), guardrails, A/B testing |
| Production | $199/mo | 1,000,000 | Logs (90-day retention), SSO, SLA |
| Enterprise | Custom | Unlimited | Custom retention, self-hosted support, dedicated CSM |
Caching, fallbacks, and load balancing are available on all plans. Guardrails require Developer or above.
## Installation

```bash
# Node.js / TypeScript
npm install portkey-ai

# Python
pip install portkey-ai
```
Quick setup:
```typescript
import Portkey from 'portkey-ai';

const portkey = new Portkey({
  apiKey: process.env.PORTKEY_API_KEY,
  virtualKey: process.env.OPENAI_VIRTUAL_KEY,
});

const response = await portkey.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
});

console.log(response.choices[0].message.content);
```
## When to Choose Portkey
Portkey is a strong fit when:
- You use multiple AI providers and want unified observability
- You need automatic fallbacks for production reliability
- Cost reduction via caching matters for your workload
- Your team wants a no-code way to manage prompts and virtual keys
- You want an open-source option you can self-host
Consider alternatives when:
- You're deep in the AWS ecosystem and want IAM-based access control (use AWS Bedrock)
- You only use a single provider and don't need routing logic
- You need a Vercel or Cloudflare edge-native solution (Vercel AI Gateway, Cloudflare AI Gateway)