Auth migration has a reputation for being the job that ends careers. You schedule a maintenance window, migrate the user database, flip the DNS, and then spend three hours watching locked-out customers rage-post on social. AWS just published the techniques their Cognito team used to migrate hundreds of millions of user profiles to brand-new infrastructure without a single maintenance window or customer lockout. Those same techniques — shadow mode, dual-write, anti-entropy validation — are exactly what D2C engineering teams need when they're moving off an auth provider that's become too expensive or too limited.
TL;DR: Cognito's infrastructure rewrite shows that zero-downtime auth migration is an engineering problem with a solved playbook, not a risk you have to absorb with maintenance windows. If you're scoping an auth provider switch (Auth0 pricing, GDPR residency, or peak-traffic throttling), book a 30-min call with Dev, no SDR layer. We'll map the migration against your Q4 calendar.
What AWS Actually Changed in Cognito's Infrastructure
The core architectural shift: Cognito moved from a single shared data store (Amazon Cloud Directory) to independently deployable domains. The old design made Cognito reliable but slow to evolve — changing one part of the system required coordinating across the entire shared layer. The new design lets AWS ship features to individual domain components without touching others.
Three new capabilities came directly from that architectural freedom:
- High-throughput performance: support for tens of millions of users per user pool and thousands of TPS. The previous architecture had soft throttle limits that caused auth failures for D2C brands running large simultaneous login events during flash sales or email campaign sends.
- Customer-managed KMS encryption keys: user data at rest (passwords, attributes, custom fields) can now be encrypted under a key you control. That supports GDPR Article 17 right-to-erasure workflows and the SOC 2 Type II controls that audit key lifecycle management.
- Multi-region replication: user pool data synchronized to a secondary regional pool for failover. For D2C brands expanding from US East to EU West, this is a path toward GDPR data residency without running two separate identity systems.
AWS was explicit that backward compatibility was a non-negotiable design constraint. Existing SDK calls, OAuth flows, and token formats work without modification. The new capabilities are opt-in.
Why Auth Migration Is Nothing Like Database Migration
Database migrations have a recoverable failure mode: if the migration breaks, you roll back and the data is still there. Auth migrations don't have that luxury. A failed auth migration means every active session token is invalid and every user trying to log in gets a 401. For a D2C brand with 200K MAUs, that's potentially tens of thousands of customers hitting a broken login page during peak hours.
That's why most teams schedule maintenance windows — they want to force everyone offline, migrate cleanly, and bring the new system up. The problem is that maintenance windows don't work for D2C. Your customers are global, your flash sales run at unpredictable times, and a 4 AM window that's quiet for US East is prime time for your EU customers. There's no safe hour to lock everyone out.
The AWS Cognito migration proved there's a better answer: run both systems simultaneously, compare their outputs continuously, and only cut over when you've validated parity at the record level. That's what shadow mode accomplishes.
The Shadow-Mode + Dual-Write Pattern, Explained
AWS used five techniques in combination. For D2C auth migrations, three of them are the core of what we implement:
Dual-Write
Every write operation — new user registration, password change, attribute update — goes to both the old and new system simultaneously. Neither system is authoritative yet; both receive the same data. This ensures the new system's user database is always current with the production one, even during the migration period.
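A minimal sketch of the dual-write step in Python, under stated assumptions: `InMemoryStore`, `DualWriter`, and `repair_queue` are hypothetical names, and the in-memory dictionaries stand in for real provider API calls. The sketch also assumes that a failed write to the new system is queued for replay rather than surfaced to the customer. That is one possible policy, not something AWS describes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


class InMemoryStore:
    """Hypothetical stand-in for a user store; a real one would call a provider API."""

    def __init__(self) -> None:
        self.records: Dict[str, dict] = {}
        self.available = True  # flip to False to simulate an outage

    def put_user(self, user_id: str, attrs: dict) -> None:
        if not self.available:
            raise ConnectionError("store unavailable")
        self.records[user_id] = dict(attrs)


@dataclass
class DualWriter:
    old: InMemoryStore
    new: InMemoryStore
    repair_queue: List[Tuple[str, dict]] = field(default_factory=list)

    def put_user(self, user_id: str, attrs: dict) -> None:
        # The old system still serves customers, so its errors propagate.
        self.old.put_user(user_id, attrs)
        try:
            self.new.put_user(user_id, attrs)
        except Exception:
            # Assumption: queue the write for replay instead of failing the request.
            self.repair_queue.append((user_id, dict(attrs)))

    def replay(self) -> int:
        """Retry queued writes against the new store; return how many succeeded."""
        pending, self.repair_queue = self.repair_queue, []
        done = 0
        for user_id, attrs in pending:
            try:
                self.new.put_user(user_id, attrs)
                done += 1
            except Exception:
                self.repair_queue.append((user_id, attrs))
        return done
```

Replaying a stale queued write can overwrite a newer value in the new store, so a replay queue like this narrows drift without eliminating it. A record-level comparison between the two systems is still needed.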
Shadow Mode
Auth requests are processed by both systems in parallel. The old system's response goes to the client; the new system's response is logged and compared. If they match, confidence in the new system goes up. If they don't, the discrepancy is logged for investigation — and the old system's response is always what the customer sees. No customer ever experiences the new system's response until parity is validated.
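The comparison logic can be sketched as follows. This is an illustration only: `AuthResult`, `shadow_authenticate`, and the `mismatches` list are hypothetical names, the two auth callables stand in for real provider calls, and the shadow call runs sequentially here where a production setup would more likely run it asynchronously.

```python
from typing import Callable, List, NamedTuple, Optional


class AuthResult(NamedTuple):
    ok: bool
    user_id: Optional[str]


AuthFn = Callable[[str, str], AuthResult]


def shadow_authenticate(
    old_auth: AuthFn,
    new_auth: AuthFn,
    username: str,
    password: str,
    mismatches: List[dict],
) -> AuthResult:
    """Return the old system's answer; record any disagreement with the new one."""
    primary = old_auth(username, password)
    try:
        shadow = new_auth(username, password)
    except Exception as exc:
        # A failure in the new system must never reach the customer.
        mismatches.append({"username": username, "reason": f"shadow error: {exc!r}"})
        return primary
    if shadow != primary:
        # Log the outcomes only, never the password.
        mismatches.append({"username": username, "old": primary, "new": shadow})
    return primary
```

The function has exactly one return value in every branch, and it is always the old system's result. That structural guarantee is what makes shadow mode safe to leave running for weeks.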
Anti-Entropy Validation
A nightly Lambda job compares every user record in both systems — hashed passwords, custom attributes, session state — and flags any that differ. This catches drift caused by edge cases that shadow mode didn't exercise: users who haven't logged in during the migration period, records created before the dual-write started, timing races in concurrent write scenarios. AWS called this out as the technique that catches what automated testing alone would miss.
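A sketch of the record comparison at the heart of such a job, assuming both systems can be exported to dictionaries keyed by user ID with JSON-serializable values. `record_digest` and `find_drift` are hypothetical names, and the Lambda wiring and export steps are omitted.

```python
import hashlib
import json
from typing import Dict


def record_digest(record: dict) -> str:
    """Hash a record in canonical form so key order does not affect the result."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(old: Dict[str, dict], new: Dict[str, dict]) -> Dict[str, str]:
    """Return {user_id: reason} for every record that differs between systems."""
    drift: Dict[str, str] = {}
    for user_id in sorted(old.keys() | new.keys()):
        if user_id not in new:
            drift[user_id] = "missing_in_new"
        elif user_id not in old:
            drift[user_id] = "missing_in_old"
        elif record_digest(old[user_id]) != record_digest(new[user_id]):
            drift[user_id] = "mismatch"
    return drift
```

Because the loop walks the union of both key sets, it covers accounts that never logged in during the migration, which is the gap shadow mode leaves open.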
Auth migrations fail at the edges.
We've run this migration pattern across several D2C builds on AWS infrastructure. If you want our scoping checklist and timeline estimate for your specific user volume and auth provider, grab 30 minutes with Dev — written brief inside a week.
How We Applied This in a Real D2C Auth Migration
A $16M outdoor gear brand came to us needing to move off Auth0 before Q4. Auth0's 2025 pricing restructure had tripled their monthly cost — they were paying $1,800/month for 280K MAUs and the new tier pushed that to $5,400/month. Moving before Black Friday wasn't optional financially, but taking a maintenance window with 14 weeks to their peak was also a non-starter.
We ran a six-week migration using a three-phase approach built on the techniques AWS published. Phase one: dual-write setup and initial backfill of all existing user records into Cognito using a batched import with conflict resolution. Phase two: shadow mode for four weeks, with every Auth0 login request simultaneously processed by Cognito, responses compared, and discrepancies logged. We surfaced three edge cases that would have caused login failures: a custom attribute format difference, a token expiry handling gap, and a third-party identity provider (Google) callback URL that needed a Cognito-specific adjustment.
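The backfill in phase one has to coexist with dual-write, so it must not overwrite a record that dual-write has already refreshed. A rough sketch of one way to do that, assuming each record carries a `user_id` and an `updated_at` epoch timestamp and that "newest wins" is the conflict rule; the field names, `backfill`, and `batches` are hypothetical and not taken from the project described above.

```python
from typing import Dict, Iterator, List


def batches(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def backfill(source: List[dict], target: Dict[str, dict], batch_size: int = 500) -> Dict[str, int]:
    """Copy records into `target`, keeping whichever copy is newer on conflict."""
    stats = {"inserted": 0, "updated": 0, "skipped": 0}
    for batch in batches(source, batch_size):
        for rec in batch:
            existing = target.get(rec["user_id"])
            if existing is None:
                target[rec["user_id"]] = dict(rec)
                stats["inserted"] += 1
            elif rec["updated_at"] > existing["updated_at"]:
                target[rec["user_id"]] = dict(rec)
                stats["updated"] += 1
            else:
                # Dual-write already delivered an equal or newer copy.
                stats["skipped"] += 1
    return stats
```

Returning per-outcome counts gives a quick sanity check after each run: a large `skipped` count late in the backfill suggests dual-write is keeping the new system current.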
Phase three: cutover. Auth0 stayed live in passthrough mode for two additional weeks to handle any users with cached sessions that hadn't refreshed. Zero customer lockouts. The anti-entropy Lambda ran nightly for three weeks post-migration and flagged 47 records with attribute drift — all resolved before the old system was decommissioned. Monthly auth cost went from $5,400 to $840 on Cognito's MAU pricing at that volume.
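To make the unit economics concrete, the figures above can be reduced to cost per MAU. The snippet only restates numbers from this case study; `cost_per_mau` is a helper name introduced here for illustration.

```python
MAUS = 280_000
AUTH0_MONTHLY_USD = 5_400    # Auth0 cost after the pricing restructure
COGNITO_MONTHLY_USD = 840    # Cognito cost at the same volume


def cost_per_mau(monthly_usd: float, maus: int) -> float:
    """Monthly auth spend divided by monthly active users."""
    return monthly_usd / maus


monthly_savings = AUTH0_MONTHLY_USD - COGNITO_MONTHLY_USD
savings_pct = 100 * monthly_savings / AUTH0_MONTHLY_USD

print(f"Auth0:   ${cost_per_mau(AUTH0_MONTHLY_USD, MAUS):.4f} per MAU")
print(f"Cognito: ${cost_per_mau(COGNITO_MONTHLY_USD, MAUS):.4f} per MAU")
print(f"Savings: ${monthly_savings:,}/month ({savings_pct:.1f}%)")
```

That works out to roughly $0.019 versus $0.003 per MAU, a saving of $4,560 per month, or about 84%.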
When to Actually Migrate Off Your Current Auth Provider
Not every D2C brand needs to migrate. The signals that make a migration worth the engineering work:
- Auth cost is growing faster than revenue: Auth0 and Firebase Auth both restructured pricing in 2024–2025. If you're above 50K MAUs and haven't re-evaluated your auth cost per MAU recently, do it now. Cognito's pricing is per MAU with a generous free tier and a significantly lower per-MAU rate at 100K+.
- EU expansion is on the 18-month roadmap: right-to-erasure under GDPR Article 17 is easier to evidence when you control the encryption key lifecycle for any PII stored in your auth system. Cognito's new KMS integration gives you that on AWS without standing up a separate identity infrastructure in the EU. Multi-region replication addresses the data residency requirement.
- You've had auth throttling during a peak event: if your login success rate dropped during a flash sale or large email campaign, you've already hit the old Cognito TPS limits or your current provider's shared infrastructure ceiling. The new Cognito architecture raises that ceiling to thousands of TPS per user pool.
- Your custom attribute data has grown significantly: Cognito's old architecture limited per-user custom attributes in ways that caused D2C teams to store user preferences and loyalty data elsewhere. The new infrastructure handles richer attribute sets cleanly, which can simplify your data architecture if you're currently splitting identity data across two stores.
Four Mistakes D2C Teams Make When Planning Auth Migrations
- Treating the token format as migration-neutral: JWT claims structures differ between providers. An Auth0 token has a different claim namespace than a Cognito token. Every downstream service that reads token claims needs to be updated before cutover, not after. We map every claim consumer before we write a line of migration code.
- Skipping dormant account coverage in shadow mode: shadow mode only validates users who actually log in during the migration period. A brand with 280K MAUs might also carry 60K registered users who haven't logged in for six months. Those users exist in the old system and need to be in the new system, but will never be validated by shadow mode. Anti-entropy is the only thing that catches them.
- Cutting over before social auth parity is confirmed: Google, Apple, and Meta OAuth callbacks need to be reconfigured in each provider's developer console. These are often managed by whoever set them up originally, and that person may not be on the team anymore. Audit every social auth provider before the migration starts, not during.
- Decommissioning the old system the week after cutover: keep the old system in passthrough-only mode for at least two weeks post-cutover. Users with long-lived refresh tokens (mobile apps especially) will try to use them against the old system. Passthrough mode handles those gracefully; an offline old system causes hard failures.
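For the first mistake, one way to shield downstream services from the claim differences is a small normalization layer that every claim consumer calls. The sketch below is hedged accordingly: `normalize_claims` and the `https://example.com/roles` namespace are hypothetical, it assumes the token signature has already been verified elsewhere, and it treats any issuer that is not Cognito as the legacy Auth0 tenant. `cognito:groups` is the claim Cognito uses for group membership.

```python
from typing import Any, Dict

# Hypothetical namespace: Auth0 custom claims are namespaced, and yours will differ.
AUTH0_ROLES_CLAIM = "https://example.com/roles"


def normalize_claims(claims: Dict[str, Any]) -> Dict[str, Any]:
    """Map already-verified token claims from either provider to one shape."""
    issuer = claims.get("iss", "")
    if issuer.startswith("https://cognito-idp."):
        provider, groups = "cognito", claims.get("cognito:groups", [])
    else:
        # Assumption: anything not issued by Cognito is the legacy Auth0 tenant.
        provider, groups = "auth0", claims.get(AUTH0_ROLES_CLAIM, [])
    return {
        "provider": provider,
        "subject": claims["sub"],
        "email": claims.get("email"),
        "groups": sorted(groups),
    }
```

Note that the `sub` value itself differs between providers for the same person, so any service that stores `sub` as a foreign key also needs a mapping table before cutover.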
If you're running Cognito today and wondering how the new capabilities apply to your setup, our post on Cognito for e-commerce authentication covers the baseline configuration patterns that the next-gen infrastructure builds on top of. For D2C brands using AWS broadly, our AWS Shield flow logs post addresses the traffic-spike visibility question that often comes up alongside auth infrastructure reviews.
Frequently Asked Questions
Does Amazon Cognito's next-generation infrastructure require changes to my existing integration?
No. AWS built the next-gen infrastructure with backward compatibility as a hard requirement — existing SDK calls, token formats, and OAuth flows work without modification. The architectural shift from a shared Cloud Directory backend to independently deployable domains is fully transparent to applications. The new capabilities — customer-managed KMS encryption, multi-region replication, and higher throughput — are opt-in, not automatic. You enable them when you need them, on your timeline.
What signals tell a D2C brand it's time to migrate to a different auth provider?
Three signals we watch for: (1) Auth pricing is growing faster than MAU count. Auth0 and Firebase Auth both restructured pricing in 2024–2025, and the unit economics break for brands above 50K MAUs. (2) EU expansion is on the roadmap. Right-to-erasure under GDPR Article 17 is easier to evidence when you control the encryption key lifecycle for PII, which Cognito's new KMS integration provides on AWS. (3) You've had auth-related availability issues during a peak sale. Cognito's new high-throughput infrastructure supports thousands of TPS per user pool, raising the throttle ceiling that caused auth failures for brands with large simultaneous login events.
How long does a zero-downtime auth migration take for a D2C brand with 100K–500K MAUs?
From our work: typically 6–10 weeks end to end. One week scoping and data mapping, one to two weeks dual-write setup and initial backfill, three to five weeks shadow mode with discrepancy logging, one week final cutover with the old system in passthrough-only mode for cached sessions, and one week post-migration monitoring. The shadow-mode duration depends on your user activity distribution — brands with a large percentage of dormant accounts need longer to surface edge cases in inactive sessions before the anti-entropy pass can clear them.
About the author
AWS Practice Lead, Braincuber Technologies
Owns AWS architecture and cloud cost optimization at Braincuber. Designs production workloads on Bedrock, SageMaker, Lambda, and EC2 for US clients — averaging $4,200/month in cost savings on right-sizing audits.

