How Much Does It Cost to Deploy AI on AWS?
Published on February 25, 2026
They got a quote for $5,000/month, ran three months of real workloads, and opened a $23,000 AWS invoice.
Most of the people who ask us this question have already been burned once. That gap — between what AWS's pricing page says and what you actually pay in production — is exactly what this post is about.
Impact: The average organization spent $85,521/month on AI-native applications in 2025 — a 36.8% jump from 2024.
We have deployed AI workloads on AWS for companies across the US, UK, UAE, and Singapore. We have seen the patterns. Here is what no AWS Solutions Architect will volunteer on the first call.
The 4 AWS Services That Will Actually Eat Your Budget
AWS has 17+ AI/ML services. In production, your cost collapses into four of them.
1. Amazon SageMaker
Where you build, train, and deploy custom ML models. Training instances start at $0.10/hour for a basic ml.t3.medium — but the moment you are training a real NLP or computer vision model, you are on an ml.p3.2xlarge at $3.82/hour. For LLM-scale training, the ml.p4d.24xlarge with 8 NVIDIA A100 GPUs runs $32.77/hour.
$32.77/hour × 24 hours = $786.48/day
A 10-day training run is roughly $7,860 before you have deployed a single inference endpoint
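That arithmetic is worth wiring into a sanity check before any run. A minimal sketch using the on-demand rates quoted in this post — verify them against current SageMaker pricing for your region before trusting the output:

```python
# Back-of-envelope GPU training cost, using the on-demand rates quoted above.
# Rates are illustrative for us-east-1; check current SageMaker pricing.
HOURLY_RATES = {
    "ml.t3.medium": 0.10,      # basic CPU experimentation
    "ml.p3.2xlarge": 3.82,     # single-GPU NLP/CV training
    "ml.p4d.24xlarge": 32.77,  # 8x NVIDIA A100, LLM-scale training
}

def training_cost(instance: str, hours: float) -> float:
    """On-demand compute cost of a training run, before storage and egress."""
    return round(HOURLY_RATES[instance] * hours, 2)

# A 10-day run on the 8-GPU instance:
print(training_cost("ml.p4d.24xlarge", 24 * 10))  # 7864.8
```

Ten days of compute, no endpoint deployed yet, and the bill is already near $8K — which is why the Spot and checkpointing levers later in this post matter.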
2. Amazon Bedrock
API access to foundation models — Claude, Titan, Llama, Mistral — charged per token. Claude Sonnet 4.5 in us-east-1 costs $3.00 per million input tokens and $15.00 per million output tokens — a 5x multiplier on generation cost. If your customer-facing chatbot processes 50 million input tokens/day, that is $150/day in inference alone, before output charges.
The Agent Recursion Trap
A single user query triggering an agent can consume 10x the tokens you expect — the model "thinks" internally before it responds, and you pay for every reasoning step.
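The recursion trap is easiest to see in numbers. A sketch using the Claude Sonnet rates quoted above — the token counts and the per-step multiplier are assumptions you should replace with measurements from your own agent traces:

```python
# Estimate per-conversation Bedrock cost, including agent reasoning steps.
# Rates are the Claude Sonnet us-east-1 figures quoted above; token counts
# and the step multiplier are illustrative assumptions.
INPUT_PER_M = 3.00    # $ per million input tokens
OUTPUT_PER_M = 15.00  # $ per million output tokens

def conversation_cost(input_tokens, output_tokens, reasoning_steps=1):
    """Each agent reasoning step re-sends context and generates again,
    so we multiply both sides by the step count as a rough upper bound."""
    cost = (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000
    return cost * reasoning_steps

# Naive estimate: one 800-token prompt, one 150-token reply
naive = conversation_cost(800, 150)
# Same query through a 7-step agent loop
agent = conversation_cost(800, 150, reasoning_steps=7)
print(f"${naive:.4f} -> ${agent:.4f}")
```

A conversation you priced at half a cent quietly becomes several cents — multiply by daily query volume and the budget gap appears.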
3. EC2 GPU Instances
Power self-hosted models. The g5.xlarge costs ~$1.60/hour for basic inference. The p4d.24xlarge runs $32.77/hour for high-end LLM training.
January 2026 Price Hike
AWS raised EC2 Capacity Block for ML prices by 15% across all regions — affecting P5en, P5e, P5, P4d, Trn2, and Trn1 instances. If your budget has not been revised since Q3 2025, the numbers are wrong.
4. Supporting Infrastructure — The Cost Nobody Counts
S3 for training datasets, Lambda, API Gateway, CloudWatch logging ($0.50/GB ingested + $0.03/GB-month stored), VPC data transfer, and OpenSearch Serverless for Bedrock Knowledge Bases — which carries a floor of roughly $350/month just for the vector store to exist, even at zero query traffic.
Hidden cost: $830–$4,500/month added to mid-sized deployments without anyone noticing until the bill arrives.
What Real Deployments Actually Cost (By Business Size)
Here is what we see in live deployments — not AWS marketing pages:
| Business Size | Monthly AWS AI Spend | What's Running |
|---|---|---|
| Startup / MVP | $500 – $3,200 | Bedrock API calls + 1 SageMaker endpoint |
| Growth Stage (50–250 employees) | $8,000 – $22,000 | SageMaker pipelines + EC2 g5 inference cluster |
| Mid-Market (251–1,000 employees) | $30,000 – $70,000 | Custom LLM fine-tuning + real-time inference |
| Enterprise (1,000+ employees) | $90,000 – $110,000+ | Multi-region, multi-model, full MLOps stack |
CloudZero's 2025 research confirms the average organization spent $85,521/month on AI-native applications — a 36.8% jump from $62,964 in 2024. That is not an outlier. That is what production AI looks like when you add up every layer.
The Mistake That Turns a $5K Budget Into a $23K Bill
We had a client — a UAE-based D2C brand doing $3.7M ARR — who launched a SageMaker product recommendation endpoint. They validated it in staging, performance-tested it, and estimated $4,200/month.
Month 3 invoice: $21,540.
Here is exactly what happened:
The $21,540 Bill — Anatomy of a Cost Explosion
Idle endpoints: Their ml.m5.xlarge inference endpoint ran 24/7 at $0.269/hour — $193.68/month per endpoint at zero traffic. They had 4 running from A/B testing. That is $774.72/month in idle compute.
CloudWatch log bloat: With full prompt and response payloads logging automatically, they generated 180GB of CloudWatch data/month. $90 in logging fees buried in the bill.
Bedrock Agent recursion: Their document Q&A agent triggered 7 internal reasoning steps per user query. They thought they were spending $0.003 per conversation. Actual cost: $0.019 — 6.3x higher than estimated.
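All three line items are reproducible with simple arithmetic. A sketch reconstructing the incident's numbers, assuming a 720-hour (30-day) billing month:

```python
# Reconstructing the three cost drivers from the incident above.
IDLE_RATE = 0.269        # ml.m5.xlarge real-time endpoint, $/hour
HOURS_PER_MONTH = 720    # 30-day month; endpoints bill while in-service

idle_per_endpoint = IDLE_RATE * HOURS_PER_MONTH   # $193.68 at zero traffic
idle_total = idle_per_endpoint * 4                # 4 forgotten A/B endpoints

LOG_GB = 180
log_fees = LOG_GB * 0.50                          # CloudWatch ingestion, $0.50/GB

est_per_convo = 0.003
actual_per_convo = est_per_convo * 6.3            # 7-step agent recursion

print(round(idle_total, 2), round(log_fees, 2), round(actual_per_convo, 4))
```

None of this requires forensic accounting — it requires someone running the multiplication before go-live instead of after the month-3 invoice.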
This is not exceptional. This is what month 3 looks like when nobody audited the architecture before go-live.
The Controversial Take No AWS Sales Rep Will Give You
Everyone defaults to "start with SageMaker." Do not.
Unless you have a dedicated ML engineer on staff, SageMaker will cost you 3x more than Bedrock for the same output. SageMaker is built for data science teams that want full pipeline control — from data prep through training to deployment. If you are a COO deploying a document extraction workflow, a customer support agent, or a demand forecasting model that does not need custom training, you do not need SageMaker.
The Real Starting Points Nobody Tells You About
A fine-tuning job on Bedrock with 100K tokens runs approximately $10–$20 per job. Amazon Comprehend Custom trains a classifier on 10,000 labeled documents for $50–$200. These are your real starting points — not a $32.77/hour GPU cluster.
We have watched companies spend $14,200 on SageMaker infrastructure before realizing their entire use case — PDF data extraction and structured output — could have been built with Amazon Textract + Bedrock.
Total cost: $1,100/month, versus the $14,200 already sunk into SageMaker setup
The rule: Start with Bedrock. Graduate to SageMaker only when pre-trained models genuinely cannot do the job. Most production AI workloads at mid-market companies never reach that threshold.
The Cost Reduction Levers That Actually Work
These are not tips from an AWS re:Invent slide deck. These are the exact levers we use in every deployment:
6 Cost Reduction Levers We Deploy on Every Engagement
Spot Instances for Training
Up to 90% cheaper than on-demand GPU instances. We run all non-time-critical training on Spot with SageMaker checkpointing enabled every 12 minutes.
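The 90% figure is the AWS-advertised maximum; realized Spot discounts vary by instance family, region, and hour. A planning sketch that treats the discount and the checkpoint-restart overhead as explicit assumptions rather than burying them:

```python
# Spot vs. on-demand for a training run, with a checkpoint-restart allowance.
# spot_discount and restart_overhead are planning assumptions, not quotes —
# realized Spot pricing varies by instance family, region, and time of day.
def spot_estimate(on_demand_hourly, hours, spot_discount=0.70, restart_overhead=0.05):
    """Returns (on_demand_cost, spot_cost). restart_overhead pads for time
    lost re-loading the last checkpoint after interruptions."""
    on_demand = on_demand_hourly * hours
    spot = on_demand * (1 - spot_discount) * (1 + restart_overhead)
    return round(on_demand, 2), round(spot, 2)

# 10-day run on ml.p4d.24xlarge at a conservative 70% Spot discount:
print(spot_estimate(32.77, 240))  # (7864.8, 2477.41)
```

Even at a conservative 70% discount with restart padding, the same run drops from ~$7,900 to ~$2,500 — which is why Spot is the default for anything that can tolerate an interruption.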
Reserved Instances for Inference
A 1-year commitment on a continuously running inference endpoint saves up to 72% versus on-demand — roughly $2,100/month saved on an ml.p3.2xlarge.
Bedrock Batch Mode
Batch inference is priced at 50% less than on-demand rates. If your use case does not need a real-time response, you are leaving half your inference budget on the table.
Auto-Scaling Endpoints
A properly configured SageMaker auto-scaling policy cuts endpoint costs by 31–47% on workloads with predictable overnight or weekend traffic dips.
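The savings band comes straight from the shape of the traffic curve. A rough model — the instance counts and 12-hour peak window are illustrative assumptions, and real SageMaker auto-scaling tracks invocation metrics rather than a fixed schedule:

```python
# Rough monthly cost of an endpoint fleet with traffic-shaped scaling.
# Instance counts and the 12-hour peak window are illustrative assumptions.
def endpoint_cost(hourly_rate, peak_instances, offpeak_instances,
                  peak_hours_per_day=12, days=30):
    offpeak_hours = 24 - peak_hours_per_day
    hours = days * (peak_hours_per_day * peak_instances
                    + offpeak_hours * offpeak_instances)
    return round(hourly_rate * hours, 2)

flat = endpoint_cost(0.269, 3, 3)    # 3 instances pinned 24/7
scaled = endpoint_cost(0.269, 3, 1)  # scale to 1 during the overnight dip
savings = 1 - scaled / flat
print(flat, scaled, f"{savings:.0%}")  # 581.04 387.36 33%
```

Scaling a 3-instance fleet down to 1 for half of each day lands a 33% saving — squarely inside the 31–47% band we see on workloads with predictable dips.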
Hard Token Caps at API Gateway
One Lambda function calling Claude without a circuit breaker once cost a client $4,370 in 18 hours. Set hard monthly token limits and hourly spend alarms in AWS Budgets. This is not optional.
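The cap does not need to be sophisticated to stop an 18-hour runaway. A minimal in-process sketch of the circuit-breaker idea — class and method names here are hypothetical; in production you would back the counter with DynamoDB or CloudWatch metrics and pair it with AWS Budgets alarms:

```python
# Minimal token circuit breaker: refuse model calls once a cap is hit.
# TokenBudgetExceeded / TokenCircuitBreaker are hypothetical names for
# illustration; a real deployment needs shared, persistent state.
class TokenBudgetExceeded(RuntimeError):
    pass

class TokenCircuitBreaker:
    def __init__(self, monthly_cap: int):
        self.monthly_cap = monthly_cap
        self.used = 0

    def reserve(self, estimated_tokens: int) -> None:
        """Call before each model invocation; raises instead of overspending."""
        if self.used + estimated_tokens > self.monthly_cap:
            raise TokenBudgetExceeded(
                f"cap {self.monthly_cap} would be exceeded "
                f"({self.used} used, {estimated_tokens} requested)")
        self.used += estimated_tokens

breaker = TokenCircuitBreaker(monthly_cap=10_000_000)
breaker.reserve(250_000)           # within budget: proceeds
try:
    breaker.reserve(9_900_000)     # would blow the cap: refused
except TokenBudgetExceeded as e:
    print("blocked:", e)
```

The failure mode this prevents is exactly the one above: a retry loop or recursive agent burning tokens all night because nothing in the call path was allowed to say no.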
S3 Intelligent-Tiering
For companies with 5TB+ of ML data, automatic cold-tier migration saves $340–$920/month with zero ongoing engineering effort after first setup.
Also worth noting: AWS cut prices on p5-series GPU Spot Instances by approximately 44% in mid-2025. Running Spot inference on self-hosted open-source models like Llama 3 or Mistral can reduce raw inference costs by 60–70% compared to Bedrock on-demand rates. The economics change quarterly — which is why static budgets for AWS AI are always wrong within 6 months.
How Braincuber Structures Every AWS AI Deployment
We do not arrive with an architecture template. Every engagement starts with a 48-hour Cost Architecture Review — mapping every AWS service your specific use case requires, estimating monthly burn at 3 tiers (conservative / realistic / peak load), and flagging the top 3 cost failure points before you provision a single resource.
23 AWS AI Deployments — The Results
In our last 23 AWS AI deployments, clients reduced their projected monthly infrastructure costs by an average of 38.7% compared to what AWS-native consultants had previously quoted them.
(Yes, we track that number.)
Whether you are building a Bedrock-powered chatbot, a SageMaker churn prediction model, or a multi-agent document processing pipeline — the cost structure is different for each. And a wrong decision in week one compounds every single month after.
You Are Going to Spend Money on AWS AI. The Question Is Whether It Goes Toward Outputs — or Idle Endpoints.
Book our free 15-Minute AWS AI Cost Audit. We will map your specific use case, estimate your real monthly burn across all four cost layers, and show you exactly where the first dollars go to waste. No generic recommendations. Just the numbers specific to your AI workload.
Frequently Asked Questions
How much does it cost to deploy a basic AI chatbot on AWS?
A Bedrock + Lambda-based chatbot for moderate traffic — under 1 million API calls/month — typically costs $500–$2,200/month. Costs climb past $5,000/month when you add fine-tuning, Bedrock Agents with multi-step reasoning, persistent session storage, or high-concurrency real-time inference. Plan for the agent recursion multiplier before you commit to a budget.
Is Amazon SageMaker or AWS Bedrock cheaper for most businesses?
Bedrock is cheaper for most businesses — often by 60–70%. SageMaker makes economic sense only when custom model training is unavoidable and pre-trained foundation models genuinely cannot deliver the accuracy you need. Using SageMaker for tasks Bedrock can already handle is one of the fastest ways to unnecessarily burn through AWS credits.
What are the hidden costs of running AI on AWS?
The biggest hidden costs are idle SageMaker endpoints ($0.05–$32.77/hour at zero traffic), CloudWatch log ingestion at $0.50/GB, OpenSearch Serverless floors (~$350/month for a Bedrock Knowledge Base), Bedrock Agent recursive token loops, and API Gateway data transfer charges. These supporting services collectively add $830–$4,500/month to most mid-sized deployments.
Can AWS Spot Instances realistically be used for AI model training?
Yes — and they should be the default for non-time-critical jobs. Spot Instances cost up to 90% less than on-demand GPU instances. Enable SageMaker checkpointing every 10–15 minutes to handle interruptions. The majority of training runs complete without a single interruption, and the quarterly savings are material enough to fund additional model iterations.
How long does it realistically take to deploy AI on AWS?
A Bedrock-based API integration takes 2–4 weeks from architecture to production. A SageMaker pipeline with custom training, evaluation, and deployment takes 6–14 weeks. The timeline is rarely technical — it is the 3–5 weeks of data preparation, cleaning, and labeling that most projects underestimate before writing a single line of infrastructure code.

