Cloud & Hosting

AWS Bedrock as an AI Gateway

AWS Bedrock functions as a managed AI gateway within the AWS ecosystem, giving you unified API access to foundation models from Anthropic, Meta, Mistral, Coh...

AWS Bedrock as an AI Gateway

AWS Bedrock functions as a managed AI gateway within the AWS ecosystem, giving you unified API access to foundation models from Anthropic, Meta, Mistral, Cohere, AI21 Labs, Stability AI, and Amazon itself — all under a single AWS account, IAM policy, and billing relationship.

Unlike dedicated AI gateways (Vercel, Cloudflare, OpenRouter), Bedrock is deeply integrated with the AWS platform. It trades portability for security, compliance, and AWS-native tooling: VPC isolation, CloudWatch logging, AWS PrivateLink, and S3-backed knowledge bases.

Overview

Property Details
Type Managed AI gateway (AWS-native)
Hosting AWS cloud (fully managed)
Self-hosted option No
Pricing model Pay-per-token (input + output)
Free tier No (pay-per-use from the start)
Best for AWS-first organizations, compliance-heavy workloads

Key Features

Multi-Model Access

Bedrock provides a single API endpoint and SDK for all supported models. You switch models by changing a model ID string — no API key rotation, no new SDK dependency.

import { bedrock } from '@ai-sdk/amazon-bedrock';
import { generateText } from 'ai';

// Switch between providers by changing the model ID
const claudeResult = await generateText({
  model: bedrock('anthropic.claude-3-5-sonnet-20241022-v2:0'),
  prompt: 'Summarize this document.',
});

const llamaResult = await generateText({
  model: bedrock('meta.llama3-1-70b-instruct-v1:0'),
  prompt: 'Summarize this document.',
});

Supported Providers and Models

Provider Models Available
Anthropic Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku, Claude 3.5 Haiku
Meta Llama 3.1 (8B, 70B, 405B), Llama 3.2 (1B, 3B, 11B, 90B)
Mistral Mistral Large, Mistral Small, Mixtral 8x7B
Cohere Command R, Command R+, Command Light, Embed
AI21 Labs Jamba-Instruct
Amazon Titan Text (Express, Lite, Premier), Titan Embeddings
Stability AI Stable Diffusion (image generation)

Model availability varies by AWS region. Some models require explicit access requests through the Bedrock console.

IAM Integration

Every Bedrock request is authenticated via AWS IAM — no separate API keys to manage. Your existing IAM roles, policies, and permission boundaries apply directly.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
      ]
    }
  ]
}

You can scope IAM policies to specific models, regions, or accounts — giving you fine-grained access control that dedicated gateways don't offer out of the box.

Guardrails

Bedrock Guardrails lets you add content filtering, PII detection, and topic blocking to any model — including models that don't have native safety filters.

Configure guardrails through the AWS console, then reference them in API calls:

import { BedrockRuntimeClient, InvokeModelCommand } from '@aws-sdk/client-bedrock-runtime';

const client = new BedrockRuntimeClient({ region: 'us-east-1' });

const response = await client.send(new InvokeModelCommand({
  modelId: 'anthropic.claude-3-5-sonnet-20241022-v2:0',
  guardrailIdentifier: 'my-guardrail-id',
  guardrailVersion: '1',
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Hello' }],
    max_tokens: 1000,
    anthropic_version: 'bedrock-2023-05-31',
  }),
}));

Guardrail capabilities:

  • Content filters: Block hate speech, violence, sexual content, insults, and prompt injection
  • Denied topics: Block specific subject areas (e.g., competitors, legal advice)
  • Word filters: Block or mask specific phrases
  • PII detection: Detect and redact personal data (SSN, credit card, email, etc.)
  • Grounding checks: Detect hallucinations by comparing responses to source documents

Knowledge Bases

Bedrock Knowledge Bases adds retrieval-augmented generation (RAG) to any model. You point Bedrock at an S3 bucket, it chunks and embeds your documents, stores them in a vector store (Amazon OpenSearch, Pinecone, or Aurora PostgreSQL), and automatically retrieves relevant context at inference time.

import { BedrockAgentRuntimeClient, RetrieveAndGenerateCommand } from '@aws-sdk/client-bedrock-agent-runtime';

const client = new BedrockAgentRuntimeClient({ region: 'us-east-1' });

const response = await client.send(new RetrieveAndGenerateCommand({
  input: { text: 'What is our refund policy?' },
  retrieveAndGenerateConfiguration: {
    type: 'KNOWLEDGE_BASE',
    knowledgeBaseConfiguration: {
      knowledgeBaseId: 'your-kb-id',
      modelArn: 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0',
    },
  },
}));

Fine-Tuning

Bedrock supports supervised fine-tuning and continued pre-training for select models (Amazon Titan, Meta Llama, Cohere). Upload training data to S3, run a fine-tuning job, and deploy the resulting model — all without managing GPU infrastructure.

Fine-tuning is billed separately (training compute hours + storage), and fine-tuned models can be deployed with provisioned throughput for guaranteed performance.

Pricing Model

Bedrock uses on-demand token pricing with an optional provisioned throughput model:

On-demand (pay-per-token):

  • Input and output tokens billed separately
  • No minimum commitment
  • Prices vary by model (Claude 3.5 Sonnet is more expensive than Claude 3 Haiku)

Provisioned throughput:

  • Reserve model units (MUs) on a 1-month or 6-month commitment
  • Guarantees throughput at high volume
  • Required for fine-tuned model deployment

Example on-demand prices (as of early 2025, check AWS for current):

Model Input (per 1M tokens) Output (per 1M tokens)
Claude 3.5 Sonnet $3.00 $15.00
Claude 3 Haiku $0.25 $1.25
Llama 3.1 70B $0.72 $0.72
Mistral Large $4.00 $12.00

No markup over provider prices — you pay AWS list rates.

VPC and Network Security

Bedrock supports AWS PrivateLink and VPC endpoints, letting you invoke models without traffic leaving your VPC:

# Create a VPC endpoint for Bedrock
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxxx \
  --service-name com.amazonaws.us-east-1.bedrock-runtime \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-xxxx \
  --security-group-ids sg-xxxx

This is a major differentiator from cloud-based gateways — your inference traffic never traverses the public internet.

Setting Up Bedrock

# Install AWS SDK packages
npm install @ai-sdk/amazon-bedrock @aws-sdk/client-bedrock-runtime

# Configure credentials
export AWS_ACCESS_KEY_ID=your-access-key
export AWS_SECRET_ACCESS_KEY=your-secret-key
export AWS_REGION=us-east-1

Enable model access in the AWS console (Bedrock → Model access → Enable specific models). Not all models are enabled by default — some require requesting access and waiting for approval.

When to Choose AWS Bedrock

Bedrock is the right choice when:

  • Your organization is already heavily invested in AWS (IAM, VPC, CloudWatch, S3)
  • You need VPC-isolated AI inference for compliance or regulatory reasons (HIPAA, FedRAMP, SOC 2)
  • You want fine-tuning without managing GPU infrastructure
  • You need a RAG pipeline that stays within the AWS ecosystem
  • Your team already manages IAM and prefers no additional credential systems

Consider alternatives when:

  • You're not on AWS and don't want to be
  • You need to route across cloud providers (e.g., use Azure OpenAI alongside Anthropic)
  • You want semantic caching to reduce costs (Bedrock doesn't offer this)
  • You need detailed observability beyond what CloudWatch provides
  • You're a startup without existing AWS infrastructure

Related Resources

Ready to build?

Go from idea to launched product in a week with AI-assisted development.