How to Manage Blue-Green Deployments on AWS ECS with Database Migrations: Complete Guide
By Braincuber Team
Published on March 2, 2026
We watched a D2C brand's checkout go dark for 47 minutes during a "zero-downtime" deployment because they dropped a database column that the old version still needed. Their blue environment crashed the instant CodeDeploy rolled traffic back. The "instant rollback" they'd been promised? Useless. The database had already moved on without them. This tutorial is for anyone who's been told blue-green deployments are simple. They are, until you add a shared database to the mix. This step-by-step guide covers the expand-contract pattern, ECS Fargate, CodeDeploy traffic shifts, and the 3 rollback strategies you actually need.
What You'll Learn:
- Why blue-green deployments break with shared databases (and how to fix it)
- The expand-contract pattern for backwards-compatible schema migrations
- Setting up ECS Fargate with ALB target groups and CodeDeploy
- Using Terraform to provision the full blue-green infrastructure
- Executing a linear 10%-every-3-minutes traffic shift
- 3 rollback strategies depending on where you are in the deployment
- Monitoring CloudWatch metrics during traffic shifts
The Database Problem Nobody Warns You About
Blue-green works perfectly for stateless apps. Deploy green alongside blue, shift traffic, done. But the moment both environments connect to the same RDS instance, you're sitting on a landmine. Blue expects schema version N. Green expects version N+1. Any breaking schema change — a dropped column, a renamed field, a changed constraint — and one environment fails.
Schema Versioning Conflicts
Blue expects column address (text). Green expects street_address, city, state, zip_code. Drop the old column before green is stable and your rollback path is gone. Both versions must coexist against the same schema during the transition.
Irreversible Migrations
Dropping a column, changing a data type, restructuring a table — these can't be undone with a simple ALTER TABLE. Once you contract, you need a database snapshot restore to roll back. That takes 10-30 minutes and you lose all data written after the snapshot.
Failed Rollbacks
The promise of "instant rollback" is hollow when your database has evolved past what blue can handle. CodeDeploy shifts traffic back to blue in seconds — but blue crashes immediately because the columns it depends on are gone. That's the 47-minute outage scenario.
Data Inconsistencies
Green writes data in a new format. Blue can't read it. Traffic shifts back — and now blue is serving corrupted or missing data to real customers. Every order, every profile update written by green becomes an error for blue.
3 Database Migration Strategies (Pick One)
Not all strategies are equal. The expand-contract pattern should be your default. The other two exist for edge cases.
| Strategy | When to Use | Complexity | Cost |
|---|---|---|---|
| Expand-Contract (Recommended) | Adding/removing columns, renaming fields, changing constraints | Medium — 3 phases | Low — single DB |
| Parallel Schemas / Databases | Complete data model redesigns, switching DB engines (MySQL to Postgres) | High — DMS sync | 2x DB cost |
| Feature Flags + Gradual Rollout | Large uncertain features, percentage-based rollouts (5% to 100%) | Medium — code branches | Low — AWS AppConfig |
The Expand-Contract Pattern — How It Actually Works
Break every schema change into 3 phases. Never drop anything until the old version is fully decommissioned. This is the only pattern that keeps your rollback path alive throughout the entire deployment.
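-- Splitting 'address' into structured fields
-- Phase 1 (EXPAND): add new columns, keep the old one
ALTER TABLE customers ADD COLUMN street_address VARCHAR(255);
ALTER TABLE customers ADD COLUMN city VARCHAR(100);
ALTER TABLE customers ADD COLUMN state VARCHAR(50);
ALTER TABLE customers ADD COLUMN zip_code VARCHAR(20);
-- Backfill from old column. This copies the full address string into
-- street_address; city, state, and zip_code stay NULL until they are parsed or re-entered.
UPDATE customers SET street_address = address WHERE street_address IS NULL;
-- Phase 2 (DEPLOY): no schema change. Green reads the new columns and writes to both old and new.
-- Phase 3 (CONTRACT): only after green is 100% stable and nothing reads or writes 'address'
-- Take RDS snapshot FIRST
ALTER TABLE customers DROP COLUMN address;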
Never Contract Without a Snapshot
Take an RDS snapshot immediately before running the contract migration. Once you drop the old column, your only rollback path is restoring from this snapshot — which takes 10-30 minutes and loses all data written after the snapshot. This is why you wait 24-72 hours before contracting. That window lets you catch issues while the safe rollback path (just shifting traffic back to blue) is still available.
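One way to make the snapshot-first rule hard to skip is to script it. The sketch below assumes boto3's RDS client, using its `create_db_snapshot` call and `db_snapshot_available` waiter. The function names, the `pre-contract` prefix, and the `ecommerce-db` instance identifier in the usage note are hypothetical. The client is passed in as an argument so the logic can be exercised with a stub.

```python
from datetime import datetime, timezone
from typing import Optional


def snapshot_id(instance_id: str, now: datetime) -> str:
    """Build an identifier such as 'pre-contract-ecommerce-db-20260302-1430'.

    RDS snapshot identifiers allow letters, digits, and single hyphens.
    """
    return f"pre-contract-{instance_id}-{now:%Y%m%d-%H%M}"


def snapshot_before_contract(rds, instance_id: str, now: Optional[datetime] = None) -> str:
    """Create a manual snapshot and block until it is available.

    `rds` is expected to behave like boto3.client("rds").
    """
    now = now or datetime.now(timezone.utc)
    snap = snapshot_id(instance_id, now)
    rds.create_db_snapshot(
        DBSnapshotIdentifier=snap,
        DBInstanceIdentifier=instance_id,
    )
    # Return only once the snapshot can actually be used for a restore.
    rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier=snap)
    return snap
```

In a real run you would call `snapshot_before_contract(boto3.client("rds"), "ecommerce-db")` and record the returned identifier next to the migration, so the restore target is never in doubt.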
The 9-Step Deployment Walkthrough
This is a real end-to-end example: migrating a single address field to structured street_address, city, state, and zip_code fields on an ecommerce app running ECS Fargate.
Deploy Infrastructure and Blue Environment with Terraform
Clone the repo, create terraform.tfvars with your AWS region, DB credentials, VPC CIDR, and container config. Run terraform init, then terraform apply -target=aws_ecr_repository.app to create ECR first. Build your Docker image with docker build --platform linux/amd64, push it to ECR, update container_image in tfvars, then terraform apply the full stack. Takes ~15-20 minutes. This provisions your ECS cluster, ALB with two target groups (blue + green), RDS Postgres, CodeDeploy app, and a bastion host.
Initialize the Database Schema via Bastion
SCP your init.sql and migration files to the bastion host. SSH in, then run psql -h $DB_ENDPOINT -U dbadmin -d ecommerce -f /tmp/init.sql to create the initial schema with the single address column. Verify with \d customers. This is your blue schema — version 1 of the database that the currently running app expects.
Verify the Blue Environment Is Healthy
Hit curl $ALB_URL/health. You should get {"status":"healthy","version":"blue","database":"connected","schema":"compatible"}. Create a test customer with the old single-address format: curl -X POST $ALB_URL/api/customers with a JSON body containing name, email, and address. Verify it persists. This baseline proves the system is healthy before you start changing anything. If something breaks once you run the expand migration in the next step, you know the migration caused it.
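If you want this baseline check to be repeatable, a small validator for the health payload helps. This is a minimal sketch that assumes the response body has exactly the four fields shown above; `check_health` is a hypothetical helper, not part of the app.

```python
import json
from typing import List

# Field values taken from the sample /health response in this guide.
EXPECTED = {"status": "healthy", "database": "connected", "schema": "compatible"}


def check_health(body: str, expected_version: str) -> List[str]:
    """Return the problems found in a /health response body (empty list = OK)."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return ["response is not valid JSON"]
    if not isinstance(payload, dict):
        return ["response is not a JSON object"]
    problems = [
        f"{key}: expected {want!r}, got {payload.get(key)!r}"
        for key, want in EXPECTED.items()
        if payload.get(key) != want
    ]
    if payload.get("version") != expected_version:
        problems.append(
            f"version: expected {expected_version!r}, got {payload.get('version')!r}"
        )
    return problems
```

Feed it the body returned by the curl call. Run it with `expected_version="blue"` now, and with `"green"` against the test listener later in the walkthrough.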
Run the Expand Migration — Add New Columns
SSH into bastion and run the expand migration: ALTER TABLE customers ADD COLUMN street_address, city, state, zip_code. Backfill existing data from the old address column. The critical rule: do NOT drop the old column. Blue is still running against this database and needs it. After this step, the schema supports both v1 (reads address) and v2 (reads structured fields). Verify blue still works by creating another customer.
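The expand step can be rehearsed locally before it touches RDS. The sketch below uses Python's built-in sqlite3 as a stand-in for Postgres, which is an assumption made so the example runs anywhere. It also assumes the table has an `id` primary key. It adds the four columns, skips any that already exist so the migration is safe to re-run, and backfills in small batches so each transaction stays short.

```python
import sqlite3

NEW_COLUMNS = {
    "street_address": "VARCHAR(255)",
    "city": "VARCHAR(100)",
    "state": "VARCHAR(50)",
    "zip_code": "VARCHAR(20)",
}


def expand(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Add the new columns and backfill street_address in batches.

    Returns the number of rows backfilled. The old `address` column is
    never modified or dropped here.
    """
    existing = {row[1] for row in conn.execute("PRAGMA table_info(customers)")}
    for name, sql_type in NEW_COLUMNS.items():
        if name not in existing:  # re-running the migration is a no-op
            conn.execute(f"ALTER TABLE customers ADD COLUMN {name} {sql_type}")
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE customers SET street_address = address
            WHERE id IN (
                SELECT id FROM customers
                WHERE street_address IS NULL AND address IS NOT NULL
                LIMIT ?
            )
            """,
            (batch_size,),
        )
        conn.commit()  # one short transaction per batch
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
```

The `address IS NOT NULL` filter matters: without it, rows with no address would match every batch and the loop would never finish. The same batched `UPDATE ... WHERE id IN (SELECT ... LIMIT n)` shape should carry over to Postgres, though the column check would use `information_schema` or `ADD COLUMN IF NOT EXISTS` instead of `PRAGMA`.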
Build and Push the Green Application Image
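Update your application code so version 2 reads from the new structured fields, falls back to the old address column when they are NULL, and writes to both old and new columns. The dual-write ensures blue can still read data if you roll back. The fallback covers rows that blue writes during the traffic shift, which only populate address. Tag the image as v2.0.0, build with docker build --platform linux/amd64, and push to ECR. Register a new ECS task definition with APP_VERSION=green and the v2 image URI. Do not update the ECS service directly: a service that uses the CodeDeploy deployment controller picks up the new task definition through the deployment in the next step.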
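The dual-write idea is easier to review as code. The sketch below shows one possible shape for green's write and read paths. The function names and the way the legacy string is assembled are assumptions for illustration. The read path falls back to `address` because blue keeps inserting rows with only the old column populated while traffic is still split.

```python
from typing import Any, Dict, Optional


def to_row(name: str, email: str, street: str, city: str, state: str, zip_code: str) -> Dict[str, Any]:
    """Column values for an INSERT or UPDATE issued by green (v2).

    Writes the structured fields and the legacy `address` column, so blue
    can still read the row after a rollback.
    """
    return {
        "name": name,
        "email": email,
        "street_address": street,
        "city": city,
        "state": state,
        "zip_code": zip_code,
        # Legacy single-line format; the exact layout is an assumption.
        "address": f"{street}, {city}, {state} {zip_code}",
    }


def display_address(row: Dict[str, Optional[str]]) -> str:
    """Read path for green: prefer structured fields, fall back to `address`."""
    parts = [row.get(k) for k in ("street_address", "city", "state", "zip_code")]
    if all(parts):
        street, city, state, zip_code = parts
        return f"{street}, {city}, {state} {zip_code}"
    # Rows written by blue, or only backfilled, lack the structured fields.
    return row.get("address") or row.get("street_address") or ""
```

Keeping both paths in two small functions also makes version 3 simple: delete the `address` key from `to_row` and the fallback line from `display_address`.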
Execute the Blue-Green Deployment via CodeDeploy
Create an appspec.json that points to the new task definition and names the container and port the load balancer routes to. Trigger the deployment with aws deploy create-deployment using CodeDeployDefault.ECSLinear10PercentEvery3Minutes. This shifts 10% of traffic to green every 3 minutes, completing in ~30 minutes. Monitor progress by polling aws deploy get-deployment, for example under watch -n 10. Test green via the test listener (port 8080) while production traffic is still on blue.
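For reference, the AppSpec content for an ECS deployment is small enough to generate rather than hand-edit. The sketch below builds it and the matching arguments for boto3's `codedeploy.create_deployment`. The helper names are hypothetical, and the ARN, container name, application, and deployment group you pass in are placeholders for your own values.

```python
import json
from typing import Any, Dict


def build_appspec(task_definition_arn: str, container_name: str, container_port: int) -> str:
    """Return AppSpec content (as JSON) for an ECS blue/green deployment."""
    appspec = {
        "version": 0.0,
        "Resources": [
            {
                "TargetService": {
                    "Type": "AWS::ECS::Service",
                    "Properties": {
                        "TaskDefinition": task_definition_arn,
                        "LoadBalancerInfo": {
                            "ContainerName": container_name,
                            "ContainerPort": container_port,
                        },
                    },
                }
            }
        ],
    }
    return json.dumps(appspec, indent=2)


def deployment_request(application: str, deployment_group: str, appspec_content: str) -> Dict[str, Any]:
    """Keyword arguments for codedeploy.create_deployment(**kwargs)."""
    return {
        "applicationName": application,
        "deploymentGroupName": deployment_group,
        "deploymentConfigName": "CodeDeployDefault.ECSLinear10PercentEvery3Minutes",
        "revision": {
            "revisionType": "AppSpecContent",
            "appSpecContent": {"content": appspec_content},
        },
    }
```

With boto3 installed, `boto3.client("codedeploy").create_deployment(**deployment_request(...))` would start the shift and return a deployment ID that you can poll.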
Validate the Green Environment Under Load
Create customers using the new structured address format on the production URL. Verify existing customers (created by blue) still render correctly. Check TargetResponseTime in CloudWatch — green should be within 10-20% of blue's baseline. Monitor HTTPCode_Target_5XX_Count — even a single 500 during traffic shift warrants investigation. Watch DatabaseConnections for pool exhaustion and CPUUtilization on both ECS tasks and RDS.
Monitor for 24-72 Hours Before Contracting
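Do NOT drop old columns yet. Green is serving 100% of traffic, but the old address column is still in the database. If anything goes wrong in the next 24-72 hours, you can shift traffic back to blue in seconds via aws elbv2 modify-listener pointing to the blue target group, provided the blue tasks are still running. CodeDeploy terminates the original task set once the deployment group's termination wait time expires, so set that wait to cover your monitoring window or be ready to redeploy the previous task definition. During this window, watch business metrics: checkout completion rates, API success rates, customer complaints. Technical metrics can look fine while user-facing functionality is broken.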
Run the Contract Migration and Clean Up
After 24-72 hours with zero issues: first deploy version 3 of your app, which removes the dual-write logic and no longer reads or writes address. If you drop the column while version 2 is still dual-writing, every write fails. Then take an RDS snapshot and run ALTER TABLE customers DROP COLUMN address. Decommission the blue environment. Update your Terraform state. From this point, rollback requires a database snapshot restore, which is why you waited 24-72 hours and confirmed everything works before reaching this point.
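Because the contract step is the one you cannot undo cheaply, it can be worth gating it behind an explicit checklist. The sketch below is one hypothetical way to encode that gate. The field names and the 24-hour default are illustrative. It also assumes the application must stop using `address` before the column is dropped, since a write to a dropped column would fail.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReleaseState:
    hours_at_full_traffic: float      # how long green has served 100% of traffic
    app_still_uses_old_column: bool   # True while dual-write code is deployed
    snapshot_id: Optional[str]        # pre-contract RDS snapshot, None if not taken
    open_incidents: int               # incidents attributed to this release


def contract_blockers(state: ReleaseState, min_hours: float = 24.0) -> List[str]:
    """Return the reasons the contract migration should not run yet."""
    blockers = []
    if state.hours_at_full_traffic < min_hours:
        blockers.append(
            f"green has served 100% for {state.hours_at_full_traffic:g}h, need {min_hours:g}h"
        )
    if state.app_still_uses_old_column:
        blockers.append("running app still reads or writes the old column")
    if not state.snapshot_id:
        blockers.append("no RDS snapshot taken")
    if state.open_incidents > 0:
        blockers.append(f"{state.open_incidents} open incident(s)")
    return blockers
```

An empty list means the DROP COLUMN can proceed. Anything else is a reason to stay in the window where shifting traffic back is still an option.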
The 3 Rollback Strategies (Know Which One You're In)
Your rollback options depend entirely on when you catch the problem. The further along the deployment, the more painful the rollback. This is why monitoring matters.
| Timing | Method | Speed | Data Loss |
|---|---|---|---|
| During traffic shift | aws deploy stop-deployment --deployment-id <id> --auto-rollback-enabled | Seconds | None |
| After deploy, before contract | aws elbv2 modify-listener → blue target group (blue tasks must still be running) | Seconds | None |
| After contract phase | aws rds restore-db-instance-from-db-snapshot (restores into a new instance; repoint the app to it) | 10-30 minutes | All writes after snapshot |
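The table can be turned into a small lookup so that an on-call engineer does not have to work out which row applies at 2 AM. This is a sketch under stated assumptions: `InProgress` and `Succeeded` are CodeDeploy deployment statuses, while the `contract_applied` flag is something your own tooling would need to track.

```python
from enum import Enum
from typing import Dict


class Phase(Enum):
    SHIFTING = "during traffic shift"
    DEPLOYED = "after deploy, before contract"
    CONTRACTED = "after contract phase"


# Commands and expectations mirror the rollback table in this guide.
ROLLBACKS: Dict[Phase, Dict[str, str]] = {
    Phase.SHIFTING: {
        "command": "aws deploy stop-deployment --deployment-id <id> --auto-rollback-enabled",
        "speed": "seconds",
        "data_loss": "none",
    },
    Phase.DEPLOYED: {
        "command": "aws elbv2 modify-listener (forward to the blue target group)",
        "speed": "seconds",
        "data_loss": "none",
    },
    Phase.CONTRACTED: {
        "command": "aws rds restore-db-instance-from-db-snapshot",
        "speed": "10-30 minutes",
        "data_loss": "all writes after snapshot",
    },
}


def current_phase(deployment_status: str, contract_applied: bool) -> Phase:
    """Map a CodeDeploy status plus migration state to a rollback phase."""
    if contract_applied:
        return Phase.CONTRACTED
    if deployment_status == "InProgress":
        return Phase.SHIFTING
    if deployment_status == "Succeeded":
        return Phase.DEPLOYED
    raise ValueError(f"no rollback plan for deployment status {deployment_status!r}")
```

For example, `ROLLBACKS[current_phase("Succeeded", contract_applied=False)]` returns the listener-based rollback, which is the last option that costs no data.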
5 CloudWatch Metrics to Watch During Traffic Shifts
Don't deploy blind. These 5 metrics tell you whether green is healthy or heading for a crash. Monitor both target groups simultaneously.
| Metric | What to watch |
|---|---|
| TargetResponseTime | Green should stay within 10-20% of blue |
| RequestCount | Blue should decrease as green increases proportionally |
| HTTPCode_Target_5XX_Count | Zero tolerance during the shift; a single 5XX warrants investigation |
| DatabaseConnections | Watch for pool exhaustion (a spike, or a plateau at the maximum) |
| CPUUtilization | Track both ECS and RDS; if green runs higher than blue, new queries may need indexes |
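These thresholds are simple enough to check automatically on every poll. The sketch below evaluates one set of readings against the rules above. It assumes you have already fetched the values from CloudWatch. The function name and the 90% connection ceiling are assumptions; the 20% latency tolerance is the upper end of the range given in this guide.

```python
from typing import List


def shift_warnings(
    blue_latency_s: float,
    green_latency_s: float,
    green_5xx_count: int,
    db_connections: int,
    db_max_connections: int,
    latency_tolerance: float = 0.20,
    connection_ceiling: float = 0.90,
) -> List[str]:
    """Return warnings for one polling interval (empty list = healthy)."""
    warnings = []
    if blue_latency_s > 0 and green_latency_s > blue_latency_s * (1 + latency_tolerance):
        slower = green_latency_s / blue_latency_s - 1
        warnings.append(f"green latency is {slower:.0%} above blue")
    if green_5xx_count > 0:
        warnings.append(f"green returned {green_5xx_count} 5XX response(s)")
    if db_connections >= db_max_connections * connection_ceiling:
        warnings.append(
            f"database connections at {db_connections}/{db_max_connections}"
        )
    return warnings
```

Any non-empty result during the traffic shift is a reasonable trigger for stopping the deployment while the rollback is still free.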
When NOT to Use Blue-Green
- Very large migrations that take hours or require table locks: use a maintenance window.
- WebSocket apps with complex in-memory state: use rolling deployments.
- Cost-constrained teams: running two environments doubles cost; use canary deployments instead.
- Complete data model redesigns: use the strangler fig pattern to migrate gradually.
Frequently Asked Questions
What is the expand-contract pattern in database migrations?
It splits schema changes into 3 phases: Expand (add new columns, keep old ones), Deploy (green uses new columns but writes to both), Contract (drop old columns after 24-72 hours). This keeps both blue and green compatible with the same database throughout the deployment.
How long should I wait before running the contract migration?
Wait 24-72 hours after green is serving 100% of traffic. This window lets you catch bugs, performance regressions, and business metric changes while the safe rollback path (shifting traffic back to blue) is still available. Once you contract, rollback requires a database snapshot restore.
Can I roll back a blue-green deployment on AWS ECS after CodeDeploy completes?
Yes, if you haven't run the contract migration yet and the blue tasks are still running. Use aws elbv2 modify-listener to point the ALB listener back to the blue target group. The database still has both old and new columns, so blue functions normally. This takes seconds with zero data loss.
What AWS services are required for blue-green deployments on ECS?
You need an ECS cluster (Fargate), an Application Load Balancer with two target groups, AWS CodeDeploy for traffic shifting, RDS for the shared database, and optionally Parameter Store for connection strings. Terraform or CDK can provision all of this as infrastructure-as-code.
How does CodeDeploy ECSLinear10PercentEvery3Minutes work?
It shifts 10% of production traffic from blue to green every 3 minutes. The full cutover takes about 30 minutes. During this window, you can run aws deploy stop-deployment with --auto-rollback-enabled to revert all traffic to blue if you detect errors or latency spikes.
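To see where the "about 30 minutes" figure comes from, the schedule can be written out. The sketch below assumes the first 10% increment is applied at the start of the shift, which is an assumption about timing rather than something stated in this guide. `linear_schedule` is a hypothetical helper.

```python
from typing import List, Tuple


def linear_schedule(step_percent: int = 10, interval_minutes: int = 3) -> List[Tuple[int, int]]:
    """Return (minute, green_percent) pairs for a linear traffic shift.

    Assumes the first increment lands at minute 0.
    """
    schedule = []
    percent, minute = 0, 0
    while percent < 100:
        percent = min(100, percent + step_percent)
        schedule.append((minute, percent))
        minute += interval_minutes
    return schedule
```

Under that assumption there are ten steps and green reaches 100% at minute 27. If the first increment instead waits one interval, the cutover lands at minute 30. Either way, each 3-minute gap is a checkpoint where you can compare blue and green before more traffic moves.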
