Cloud & Hosting

Portkey AI Gateway

Portkey is an AI gateway that provides a unified API across 200+ language models with built-in fallbacks, load balancing, semantic caching, and an observabil...

Portkey AI Gateway

Portkey is an AI gateway that provides a unified API across 200+ language models with built-in fallbacks, load balancing, semantic caching, and an observability dashboard. It's available as a managed cloud service or as a self-hosted open-source deployment.

Portkey sits between your application and the AI providers. Your app calls Portkey once; Portkey handles provider routing, retries, caching, and logging. You get a single integration point regardless of how many AI providers you use.

Overview

Property Details
Type AI gateway (cloud or self-hosted)
Open source Yes (gateway core on GitHub)
Supported providers 200+ (OpenAI, Anthropic, Mistral, Cohere, Google, AWS, Azure, and more)
Pricing Free tier available; paid plans for higher volume
Self-hosted Yes (Docker, Kubernetes)
Best for Teams wanting observability + reliability across multiple AI providers

Key Features

Unified API

Portkey exposes a single API that wraps all providers with an OpenAI-compatible interface. Change your baseURL and apiKey, and your existing OpenAI SDK code works with any provider:

import Portkey from 'portkey-ai';

const portkey = new Portkey({
  apiKey: 'PORTKEY_API_KEY',
  virtualKey: 'ANTHROPIC_VIRTUAL_KEY', // your encrypted Anthropic key
});

// Same OpenAI-style API, now routing through Anthropic
const completion = await portkey.chat.completions.create({
  messages: [{ role: 'user', content: 'What is the capital of France?' }],
  model: 'claude-3-5-sonnet-20241022',
});

You can also use the OpenAI SDK directly by overriding baseURL:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'PORTKEY_API_KEY',
  baseURL: 'https://api.portkey.ai/v1',
  defaultHeaders: {
    'x-portkey-virtual-key': 'ANTHROPIC_VIRTUAL_KEY',
  },
});

Virtual Keys

Virtual keys are encrypted references to your actual provider API keys. Store your real keys in Portkey's vault; reference them by a virtual key alias in your code.

Benefits:

  • Key rotation without code changes: Update the real key in Portkey; your code keeps using the same virtual key
  • Scoped access: Share virtual keys with team members without exposing raw credentials
  • Automatic redaction: Real keys never appear in logs or request traces
  • Spend limits: Set per-virtual-key spend caps to prevent runaway costs

Create virtual keys in the Portkey dashboard or via API, then reference them in headers:

const portkey = new Portkey({
  apiKey: 'PORTKEY_API_KEY',
  virtualKey: 'openai-prod-key', // your alias, not the real key
});

Fallbacks

Configure automatic failover when a provider returns an error or is unavailable. If the primary model fails, Portkey retries with the next provider in the list — transparently, without changes to your application code.

import { createConfig } from 'portkey-ai';

const config = createConfig({
  strategy: { mode: 'fallback' },
  targets: [
    { virtualKey: 'openai-key', overrideParams: { model: 'gpt-4o' } },
    { virtualKey: 'anthropic-key', overrideParams: { model: 'claude-3-5-sonnet-20241022' } },
    { virtualKey: 'mistral-key', overrideParams: { model: 'mistral-large-latest' } },
  ],
});

const portkey = new Portkey({ apiKey: 'PORTKEY_API_KEY', config });

Fallback triggers: HTTP errors (5xx), timeout, rate limit (429), or custom status codes you specify.

Load Balancing

Distribute traffic across multiple API keys or providers with configurable weights. Useful for staying under rate limits or managing cost by routing cheaper requests to lower-cost models.

const config = createConfig({
  strategy: {
    mode: 'loadbalance',
  },
  targets: [
    { virtualKey: 'openai-key-1', weight: 0.5 },
    { virtualKey: 'openai-key-2', weight: 0.3 },
    { virtualKey: 'anthropic-key', weight: 0.2 },
  ],
});

Weight values are proportional — the example above sends 50% of traffic to openai-key-1, 30% to openai-key-2, and 20% to Anthropic.

Semantic Caching

Portkey can cache responses based on semantic similarity, not just exact prompt matches. If two prompts ask essentially the same question, the second gets the cached response — reducing latency and cost.

Enable caching in your config:

const config = createConfig({
  cache: {
    mode: 'semantic',         // 'simple' for exact-match only
    maxAge: 3600,             // cache TTL in seconds
    similarityThreshold: 0.9, // 0-1, higher = stricter matching
  },
  targets: [{ virtualKey: 'openai-key' }],
});

Cache hit rates depend on your workload. High-volume applications with repetitive queries (support chatbots, FAQ systems) see the most benefit — often 20–40% cost reduction.

Observability Dashboard

Every request routed through Portkey is logged with full metadata: model, tokens, latency, cost, virtual key, status, and the full request/response body (if enabled).

The dashboard provides:

  • Request logs: Searchable history with request/response inspection
  • Cost tracking: Per-provider, per-virtual-key, and per-metadata-tag spend
  • Latency analytics: p50/p90/p99 by model and provider
  • Error rate monitoring: Track failures, fallback triggers, and cache hits
  • Token usage: Input vs. output token breakdown over time

You can also attach custom metadata to requests for filtering:

const portkey = new Portkey({
  apiKey: 'PORTKEY_API_KEY',
  virtualKey: 'openai-key',
  metadata: {
    environment: 'production',
    userId: 'user-123',
    feature: 'summarization',
  },
});

Filter logs by any metadata field in the dashboard.

Guardrails

Portkey Guardrails validates and transforms inputs and outputs before and after model calls. Available checks include:

  • Content moderation: Block unsafe content using built-in classifiers or provider moderation APIs
  • PII detection: Detect and redact personal information
  • Regex filters: Match or block patterns in inputs and outputs
  • JSON schema validation: Ensure structured outputs conform to a schema
  • Prompt injection detection: Block attempts to override system prompts

Configure guardrails in the dashboard and reference them in your config — no SDK changes needed.

Prompt Management

Portkey includes a prompt library where you can version, test, and deploy prompts independently of your application code. Teams can iterate on prompts in the dashboard without a code deploy.

// Reference a managed prompt by ID
const response = await portkey.prompts.completions.create({
  promptID: 'pp-my-summarizer-v2',
  variables: { text: 'The document content here...' },
});

Prompts support variables, versioning, and A/B testing — useful when prompt quality is critical to your product.

Supported Providers

Portkey supports 200+ providers through a combination of native integrations and OpenAI-compatible pass-through. A sample:

Category Providers
Frontier models OpenAI, Anthropic, Google Gemini, Mistral
Cloud AI AWS Bedrock, Azure OpenAI, Google Vertex AI
Open models Together AI, Fireworks, Groq, Perplexity
Embeddings OpenAI, Cohere, Voyage AI
Image generation OpenAI DALL·E, Stability AI
Self-hosted Ollama, vLLM, custom OpenAI-compatible endpoints

Self-Hosting

The Portkey gateway is open source and can be self-hosted. This is useful for air-gapped environments, compliance requirements, or teams that want full control over request data.

# Run with Docker
docker run -d \
  -p 8787:8787 \
  -e PORTKEY_API_KEY=your-api-key \
  portkeyai/gateway:latest
# Or with Docker Compose
version: '3.8'
services:
  portkey-gateway:
    image: portkeyai/gateway:latest
    ports:
      - "8787:8787"
    environment:
      - NODE_ENV=production
    restart: unless-stopped

Then point your SDK at the self-hosted instance:

const portkey = new Portkey({
  apiKey: 'PORTKEY_API_KEY',
  baseURL: 'http://localhost:8787/v1',
  virtualKey: 'openai-key',
});

The self-hosted gateway handles routing, fallbacks, and load balancing. The observability dashboard remains a Portkey-hosted SaaS feature unless you bring your own logging infrastructure.

Pricing

Plan Price Requests/month Features
Free $0 10,000 Logs (7-day retention), virtual keys, basic cache
Developer $49/mo 100,000 Logs (30-day retention), guardrails, A/B testing
Production $199/mo 1,000,000 Logs (90-day retention), SSO, SLA
Enterprise Custom Unlimited Custom retention, self-hosted support, dedicated CSM

Caching, fallbacks, and load balancing are available on all plans. Guardrails require Developer or above.

Installation

# Node.js / TypeScript
npm install portkey-ai

# Python
pip install portkey-ai

Quick setup:

import Portkey from 'portkey-ai';

const portkey = new Portkey({
  apiKey: process.env.PORTKEY_API_KEY,
  virtualKey: process.env.OPENAI_VIRTUAL_KEY,
});

const response = await portkey.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
});

console.log(response.choices[0].message.content);

When to Choose Portkey

Portkey is a strong fit when:

  • You use multiple AI providers and want unified observability
  • You need automatic fallbacks for production reliability
  • Cost reduction via caching matters for your workload
  • Your team wants a no-code way to manage prompts and virtual keys
  • You want an open-source option you can self-host

Consider alternatives when:

  • You're deep in the AWS ecosystem and want IAM-based access control (use AWS Bedrock)
  • You only use a single provider and don't need routing logic
  • You need a Vercel or Cloudflare edge-native solution (Vercel AI Gateway, Cloudflare AI Gateway)

Related Resources

Ready to build?

Go from idea to launched product in a week with AI-assisted development.