Updated June 2026 · Originally published March 25, 2026
A 340-bed US hospital got an OCR (Office for Civil Rights) audit that flagged 17 PHI exposure risks across five AI tools, two of which were sending patient data to public model endpoints with no BAA in place. Our team rebuilt the entire stack as HIPAA-compliant AI on AWS Bedrock: a signed BAA, PHI scrubbing at every tool boundary, VPC PrivateLink, and CloudWatch logs feeding their SIEM. Eleven weeks from kickoff to go-live. Zero findings on the OCR re-review, and $1.2M in annual savings on the workflows that moved.
- ✓ Amazon Bedrock is HIPAA-eligible and covered by the standard AWS BAA, accepted in AWS Artifact. Eligibility alone does not make you compliant.
- ✓ Boundary PHI scrubbing closed 14 of the 17 audit findings. The remaining 3 were policy and retention gaps.
- ✓ The model never sees raw PHI. Tokenization replaces every identifier with a stable token before any Bedrock call.
- ✓ Physician adoption went from 41% on the legacy tools to 89%, driven mostly by latency dropping from 4.1s to 380ms.
This is one engagement, anonymized under NDA, written up with the CIO's permission. HIPAA-compliant AI on AWS is an architecture decision, not a checkbox you buy with a service name. We have shipped 8 healthcare AI rebuilds on AWS Bedrock with signed BAAs since 2024, and the same gaps show up every time: PHI leaking at tool boundaries, no audit trail, and three vendors each with their own definition of compliant. Here is exactly what we found, what we built, and what the numbers looked like after six months in production.
Is AWS Bedrock HIPAA-compliant?
Yes, with the right architecture. Amazon Bedrock (AWS's managed foundation-model service) is on the AWS HIPAA-eligible services list and is covered by the standard AWS Business Associate Agreement, which you accept in AWS Artifact rather than negotiate from scratch. AWS confirms the model provider never sees your prompts or Bedrock logs, and data does not leave the service to train the underlying model. That eligibility is the floor, not the ceiling. As AWS itself puts it in its guidance on HIPAA compliance for generative AI, the customer still owns scrubbing, encryption, network isolation, and audit logging under the shared-responsibility model. Bedrock eligibility closed none of this hospital's 17 findings on its own. The architecture we wrapped around it did.
What was actually broken: 17 PHI exposure risks
The client is a 340-bed acute-care hospital in the Northeast US. Like many health systems in 2024 and 2025, they had layered on five AI tools fast: an ambient clinical-notes scribe, a prior-authorization drafting assistant, a denial-letter classifier, a discharge-summary summarizer, and a patient-portal triage chatbot. Three vendors. Two of them were sending PHI to public model endpoints under terms that did not include a BAA.
The OCR audit was triggered after a partner health plan caught the discharge summarizer sending member identifiers in a request that should have been de-identified. The findings were specific: 17 distinct exposure vectors across the five tools, from "API request body contains MRN" to "audit log retention does not meet the HIPAA documentation minimum." None of it was exotic. All of it was the predictable result of speed without a boundary.
How we rebuilt it: a BAA-first architecture on AWS Bedrock
We scoped the engagement at $187,000 over 11 weeks, paid against four milestones. The CIO insisted on a single accountable vendor for the whole AI surface, which made the architecture decisions cleaner. Here is what went in:
- AWS Bedrock as the only model endpoint. We ran Claude 3.7 Sonnet on Bedrock for reasoning-heavy work and Llama 3.3 70B where Claude was overkill on cost. The standard AWS BAA covering Bedrock was accepted in AWS Artifact in week one. (A new build today would use the current Claude Sonnet generation, with the same architecture.)
- PHI scrubbing at the tool boundary. Every request body and every tool output passes through a deterministic scrubber (Microsoft Presidio plus custom regex for MRN, NPI, DOB, address) before anything reaches a model or a log. Scrubbed payloads are what land in CloudWatch.
- VPC plus PrivateLink to Bedrock. No traffic touches the public internet. Bedrock is reached through VPC endpoints, and all compute sits in dedicated VPCs with strict NACLs.
- KMS-encrypted everything. Each tool gets its own KMS customer-managed key. S3, EBS, RDS, and CloudWatch log groups are encrypted with per-workflow keys so revocation is surgical.
- SIEM integration. A CloudWatch Logs subscription filter ships to the hospital's existing Splunk Cloud instance with their standard HIPAA dashboard pack.
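The scrub-before-log rule in the list above can be sketched with nothing but the standard library. This is a minimal illustration, not the engagement's actual scrubber: the two regex patterns, the `ScrubbingFilter` and `ListHandler` classes, and the `build_logger` helper are all hypothetical names, the MRN layout (two uppercase letters followed by eight digits) is an assumed reading of the format described in this article, and the in-memory handler merely stands in for whatever ships records to CloudWatch.

```python
import logging
import re

# Hypothetical patterns for illustration only. A real deployment would rely on
# a full detector (for example Presidio) plus institution-specific rules.
MRN_PATTERN = re.compile(r"\b[A-Z]{2}\d{8}\b")   # assumed: 2-letter prefix + 8 digits
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scrub(text: str) -> str:
    """Replace matched identifiers with fixed placeholders."""
    text = MRN_PATTERN.sub("[MRN]", text)
    return SSN_PATTERN.sub("[SSN]", text)


class ScrubbingFilter(logging.Filter):
    """Scrub each record before the handler it is attached to can emit it."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its args first, then scrub the final string,
        # so identifiers passed as format arguments are caught too.
        record.msg = scrub(record.getMessage())
        record.args = None
        return True


class ListHandler(logging.Handler):
    """Stand-in for a log-shipping handler: keeps formatted lines in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(self.format(record))


def build_logger(name: str = "tool-boundary"):
    """Return a logger whose only sink sits behind the scrubbing filter."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False          # keep unscrubbed records away from root handlers
    handler = ListHandler()
    handler.addFilter(ScrubbingFilter())
    logger.addHandler(handler)
    return logger, handler
```

Attaching the filter to the handler rather than to individual call sites means a developer cannot forget to scrub: anything that reaches that sink has already passed through `scrub`.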
Sitting on an OCR finding or a BAA gap of your own? We will walk your current AI surface and your audit gaps in 30 minutes.
Book a free HIPAA call →
How did PHI scrubbing close 14 of the 17 findings?
The boundary scrubber did the heavy lifting. Every payload going to a model and every payload going to logs runs through a six-stage pipeline:
- A Presidio NER pass for names, addresses, dates, phones, emails, SSNs, and MRN patterns.
- A custom regex pass for institution-specific identifiers. This hospital's MRN is 8 digits plus a 2-letter prefix, so we wrote a dedicated pattern for it.
- A second check with Amazon Bedrock Guardrails sensitive-information filters, which can block or anonymize predefined PII types and custom regex at the platform level.
- Tokenization. Every detected element is replaced with a stable token like [MRN-A47] so downstream calls see consistent references without raw PHI.
- A confidence gate. If combined detector confidence drops below 0.97 on any payload, the request fails closed with a logged exception. Better to lose a request than to leak one.
- De-tokenization only at the final UI render layer, behind an authenticated session with full audit logging.
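The tokenization, confidence-gate, and de-tokenization stages above can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions rather than the production pipeline: `TokenVault`, `ScrubError`, and `scrub_payload` are hypothetical names, the MRN regex assumes two uppercase letters followed by eight digits, the token is a truncated keyed hash (so its shape differs from the `[MRN-A47]` example), and the confidence score is passed in as a parameter because the detectors that would produce it (Presidio, Guardrails) are not modeled here.

```python
import hashlib
import hmac
import re
from dataclasses import dataclass, field

MRN_PATTERN = re.compile(r"\b[A-Z]{2}\d{8}\b")  # assumed MRN layout
CONFIDENCE_FLOOR = 0.97                          # threshold quoted in the article


class ScrubError(Exception):
    """Raised when detection confidence is too low to forward the payload."""


@dataclass
class TokenVault:
    """In-memory token-to-value map (hypothetical; a real vault would be an
    encrypted, access-controlled store)."""
    secret: bytes
    _by_token: dict = field(default_factory=dict)

    def tokenize(self, kind: str, value: str) -> str:
        # A keyed hash makes the token stable: the same value always yields
        # the same token, without exposing the value itself.
        # Truncating to 6 hex characters is for readability only; a real
        # system would need to handle collisions.
        digest = hmac.new(self.secret, value.encode(), hashlib.sha256).hexdigest()
        token = f"[{kind}-{digest[:6].upper()}]"
        self._by_token[token] = value
        return token

    def detokenize(self, text: str) -> str:
        # Intended only for the final, authenticated render step.
        for token, value in self._by_token.items():
            text = text.replace(token, value)
        return text


def scrub_payload(text: str, vault: TokenVault, confidence: float) -> str:
    """Fail closed on low confidence, otherwise swap each MRN for its token."""
    if confidence < CONFIDENCE_FLOOR:
        raise ScrubError(f"detector confidence {confidence:.2f} below {CONFIDENCE_FLOOR}")
    return MRN_PATTERN.sub(lambda m: vault.tokenize("MRN", m.group()), text)
```

Because the token is derived deterministically from the identifier, two mentions of the same patient in one payload (or across calls that share a vault secret) map to the same token, which is what lets downstream prompts stay coherent without raw PHI.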
The other 3 findings were lifecycle gaps, not architecture: audit-log retention, KMS rotation cadence, and access-review documentation. We closed them with features AWS already provides. KMS automatic rotation went on, CloudWatch Logs retention was set to 7 years (above the six-year HIPAA documentation minimum defined in 45 CFR 164.530(j)), and a quarterly access-review job now emails the compliance officer.
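As a rough sketch, the retention and rotation settings described above map onto two AWS API calls. The function below takes already-constructed clients (in practice `boto3.client("logs")` and `boto3.client("kms")`) so it can be read and exercised without AWS credentials. The function name and any log group or key identifiers passed to it are hypothetical. The value 2557 is the CloudWatch Logs retention setting that corresponds to roughly seven years.

```python
SEVEN_YEARS_DAYS = 2557  # CloudWatch Logs retention value for ~7 years


def apply_lifecycle_controls(logs_client, kms_client, log_group: str, key_id: str) -> None:
    """Set ~7-year log retention and turn on automatic KMS key rotation.

    logs_client and kms_client are expected to behave like the boto3
    "logs" and "kms" clients; log_group and key_id are caller-supplied.
    """
    # Retention above the six-year documentation minimum cited in the article.
    logs_client.put_retention_policy(
        logGroupName=log_group,
        retentionInDays=SEVEN_YEARS_DAYS,
    )
    # Automatic rotation for the customer-managed key behind this workflow.
    kms_client.enable_key_rotation(KeyId=key_id)
```

Running this once per workflow, for each log group and its customer-managed key, keeps the lifecycle settings in code where a reviewer can audit them instead of in console clicks.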
What were the results after six months in production?
Here is the before-and-after on the workflows that moved, measured on run-rate after 90 days of full production traffic:
| Metric | Before | After |
|---|---|---|
| OCR audit findings | 17 | 0 |
| Prior-auth turnaround | 5.2 days | 1.4 days |
| Denial rate after appeal | 38% | 19% |
| Physician adoption | 41% (legacy tools) | 89% |
| Annualized cost on workflows moved | $2.6M | $1.4M |
The adoption jump surprised us most. The legacy tools sat at 41% physician usage. The rebuilt versions reached 89%. The biggest driver was latency: Bedrock plus VPC PrivateLink hits a median 380ms versus 4.1 seconds on the old setup. Physicians abandon a tool that makes them wait. For industry context, the AMA's 2024 augmented-intelligence survey found 66% of physicians using AI in practice, up from 38% in 2023, with administrative burden cited as the top opportunity. Compared to that baseline, an 89% adoption rate on a clinical surface is unusually high, and we attribute it to speed and to the tools not breaking under a compliance review.
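For readers who want the relative changes rather than the raw before-and-after values, the arithmetic is short. The snippet below only re-derives figures already stated in the table and the latency comparison; the `pct_change` helper and the dictionary layout are illustrative choices, not part of the engagement.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent (negative = reduction)."""
    return (after - before) / before * 100


# (before, after) pairs taken from the article
metrics = {
    "prior_auth_days": (5.2, 1.4),
    "denial_rate_pct": (38.0, 19.0),
    "annual_cost_musd": (2.6, 1.4),
    "median_latency_ms": (4100.0, 380.0),
}

changes = {name: round(pct_change(b, a), 1) for name, (b, a) in metrics.items()}

# Absolute savings and latency speed-up factor
annual_savings_musd = round(metrics["annual_cost_musd"][0] - metrics["annual_cost_musd"][1], 1)
latency_speedup = round(metrics["median_latency_ms"][0] / metrics["median_latency_ms"][1], 1)

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
print(f"annual savings: ${annual_savings_musd}M, latency speed-up: {latency_speedup}x")
```

The result is a prior-auth turnaround cut of about 73%, a halved denial rate, a cost reduction of about 46% (the $1.2M quoted at the top), and latency roughly 10.8 times lower.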
Got AI tools touching PHI with no audit trail?
Bring your current AI inventory and your latest gap analysis. We come back in 48 hours with a written rebuild scope, a fixed price, and a target audit-clearance date.
What would we tell a health system starting this?
Three things, in order of importance:
- Settle the BAA first, before any architecture. Bedrock is covered by the standard AWS BAA you accept in AWS Artifact, which is fast. Third-party tool vendors are the slow part, so build for the worst-case vendor timeline.
- Treat boundary scrubbing as week-one work, not future work. Every shortcut we have seen here turned into an audit finding. The scrubber is the cheapest insurance in the whole build.
- Pick one partner who owns the full surface. We have rescued three health systems where five vendors each had a different definition of HIPAA-compliant and none of them matched.
Unlike a multi-vendor patchwork, a single accountable owner means one audit trail, one encryption story, and one phone number when OCR calls. That is the difference between passing a re-review and starting over.
Frequently asked questions
Is AWS Bedrock HIPAA-compliant?
Amazon Bedrock is HIPAA-eligible and covered by the standard AWS BAA, accepted in AWS Artifact. Eligibility is necessary but not sufficient. Compliance comes from the architecture: PHI scrubbing at the boundary, VPC PrivateLink, KMS encryption, and CloudWatch audit logs with HIPAA retention.
What does a HIPAA-compliant AI rebuild on AWS cost?
We typically scope $120,000 to $280,000 for a mid-sized hospital, depending on workflows in scope and physician training. This engagement was $187,000 over 11 weeks. Payback usually lands at 8 to 14 months on workflow savings alone.
Does this architecture work for payers, not just providers?
Yes. We have shipped similar architectures for two US health plans covering prior-auth automation and claims routing. The PHI scrubbing pattern is identical. Member identifiers replace MRNs, but the boundary scrubber is the same.
Does the model ever see raw PHI?
No. Tokenization at the boundary means Bedrock receives a stable token like [MRN-A47], not the real identifier. There is no fine-tuning on PHI in this design. We use prompt engineering and tool-calling against the EHR, not model training on patient data.
Get HIPAA-compliant AI on AWS, audited and signed off.
Book a free 30-minute HIPAA call. We will review your current AI surface and audit gaps, then come back with a fixed-price rebuild scope and a target clearance date.
Sources: Amazon Bedrock security and compliance, AWS: HIPAA compliance for generative AI, Amazon Bedrock Guardrails sensitive-information filters, 45 CFR 164.530 (Cornell LII), and the AMA augmented-intelligence survey. Client metrics are first-party from a single anonymized Braincuber engagement (July 2025 to April 2026), published with CIO permission; OCR details paraphrased to prevent identification. AWS Bedrock and Claude 3.7 Sonnet pricing referenced as of April 2026.
Written by Dhwani Tarwani, Co-founder & AI Practice Lead at Braincuber Technologies. Builds production AI agents on AWS Bedrock and Anthropic Claude for US healthcare, fintech, and retail clients under HIPAA-scope and SOC 2 Type II deployments.
Founder and CEO of Braincuber. Has scoped and shipped 500+ Odoo, AI, and cloud projects for US mid-market and global brands. Takes every founder call personally, with no SDR layer between buyers and the people building the system.
