AI/ML & Intelligent Systems
- Built 16 Terraform modules composing a complete AI stack (Bedrock, SageMaker, pgvector) deployable to any environment in minutes
- Designed Bedrock Flows orchestrating multi-step AI interactions with built-in guardrails for content safety in a healthcare context
- Implemented RAG pipeline with pgvector on Aurora Serverless v2 delivering sub-second semantic search across thousands of content documents
- Applied the same compliance controls (FIPS containers, vulnerability scanning, attestation) to AI infrastructure as to the core platform
- Built drift detection CI workflows that catch manual console changes to AI resources before environments diverge
The Challenge
The platform needed multiple AI/ML capabilities: session transcript analysis, peer support chatbots, intelligent content search, and vocal biomarker assessment. Each AI feature required its own infrastructure (inference endpoints, knowledge bases, guardrails), and all of it needed to meet the same compliance standards as the rest of the platform. The problem wasn't the AI itself. It was making AI a first-class platform service rather than a collection of one-off experiments.
Approach & Role
I designed the infrastructure layer for all AI/ML features. Every capability is Terraform-managed with the same compliance controls as any other service: FIPS containers, vulnerability scanning, least-privilege IAM, and automated deployment pipelines. The application-level AI development was collaborative with the development team, but the infrastructure, deployment, and production readiness were my responsibility.
Architecture & Patterns
Full-stack AI platform (Bedrock + ECS):
- 16 Terraform modules for a complete AI deployment, including VPC, ECS, ECR, ElastiCache, S3, and 4 Bedrock-specific modules (Knowledge Base, Flow, Guardrail, Prompt)
- Bedrock Flows orchestrate multi-step AI interactions with built-in guardrails for content safety
- Lambda functions for session management, persona routing, and event-driven alerting (a 726-line handler that posts to Google Chat webhooks)
- Terraform drift detection CI workflow catches manual console changes
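A minimal sketch of how a drift check like the one above can be wired into CI, assuming the workflow runs `terraform plan -detailed-exitcode` against each environment. With that flag, Terraform exits 0 for an empty plan, 1 on error, and 2 when the plan contains changes. The `DriftStatus` enum, the function names, and the `workdir` argument are illustrative and not taken from the actual workflow.

```python
import subprocess
from enum import Enum


class DriftStatus(Enum):
    IN_SYNC = "in_sync"
    DRIFTED = "drifted"
    ERROR = "error"


def classify_plan_exit_code(code: int) -> DriftStatus:
    """Map a `terraform plan -detailed-exitcode` exit code to a status."""
    if code == 0:
        return DriftStatus.IN_SYNC   # empty plan: live state matches the code
    if code == 2:
        return DriftStatus.DRIFTED   # non-empty plan: something changed out of band
    return DriftStatus.ERROR         # 1 (or anything else): the plan itself failed


def check_drift(workdir: str) -> DriftStatus:
    """Run a plan in `workdir` (hypothetical module path) and classify the result."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false",
         "-lock=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit_code(result.returncode)
```

A scheduled CI job could call `check_drift` once per environment and fail the run, or send an alert, whenever the status is `DRIFTED`.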
ML inference infrastructure:
- SageMaker async inference endpoints with auto-scaling for large language models
- Custom GPU Docker images for model serving (optimized for clinical summarization and emotion detection)
- EventBridge integration with token expiry guards for reliable async communication
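The token expiry guard can be illustrated with a small sketch. It assumes the guard exists because an async inference result may arrive minutes after the request was submitted, by which time a credential captured at submit time could have expired. The token shape (`expires_at`), the five-minute margin, and the `refresh` callback are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict


def token_is_usable(expires_at: datetime, now: datetime,
                    margin: timedelta = timedelta(minutes=5)) -> bool:
    """True only if the token stays valid for at least `margin` beyond `now`."""
    return now + margin < expires_at


def ensure_fresh_token(token: Dict, now: datetime,
                       refresh: Callable[[], Dict]) -> Dict:
    """Return the given token if still usable, otherwise a refreshed one."""
    if token_is_usable(token["expires_at"], now):
        return token
    return refresh()  # hypothetical callback that fetches a new token


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    stale = {"value": "old", "expires_at": now + timedelta(minutes=1)}
    fresh = ensure_fresh_token(
        stale, now,
        refresh=lambda: {"value": "new", "expires_at": now + timedelta(hours=1)},
    )
    print(fresh["value"])
```

Checking against a margin rather than the exact expiry time avoids handing a downstream call a token that expires mid-request.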
Intelligent search (RAG):
- Model Context Protocol (MCP) server for AI agent tool orchestration
- pgvector with Aurora Serverless v2 for semantic search (with FAISS fallback)
- Incremental content sync from CMS with scheduled background processing
- JWT authentication via the existing identity provider, so AI features need no separate auth system
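To make the semantic search step concrete, here is a sketch of the kind of query pgvector supports, plus a brute-force in-memory ranking that computes the same cosine distance. The table and column names (`documents`, `embedding`, `title`) are hypothetical, and the in-memory function is only a stand-in to show the ranking logic. It is not the FAISS fallback itself.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Parameterized query; `<=>` is pgvector's cosine-distance operator.
# Table and column names are hypothetical.
SEARCH_SQL = """
SELECT id, title, embedding <=> %s::vector AS distance
FROM documents
ORDER BY embedding <=> %s::vector
LIMIT %s
"""


def to_vector_literal(vec: Sequence[float]) -> str:
    """Format a Python sequence as pgvector's text form, e.g. '[1.0,0.5]'."""
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity, the quantity `<=>` orders by."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def top_k(query: Sequence[float], docs: Dict[str, Sequence[float]],
          k: int) -> List[Tuple[str, float]]:
    """Rank documents by ascending cosine distance to the query embedding."""
    scored = [(doc_id, cosine_distance(query, emb)) for doc_id, emb in docs.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]
```

With a database driver, the query would be executed with parameters along the lines of `(to_vector_literal(q), to_vector_literal(q), k)`, where `q` is the embedding of the user's search text.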
Data pipeline:
- AWS Glue ETL jobs replicating production data to analytics environments
- PII obfuscation (MD5 hashing of names, emails, phone numbers) before analytics ingestion
- Schema drift detection with incremental sync (only changed records transferred)
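The obfuscation and incremental sync steps above can be sketched in plain Python. The production jobs run in AWS Glue, so this only illustrates the logic, not the Glue code. The field names in `PII_FIELDS`, the `id` key, and the fingerprint-based change check are assumptions made for the example.

```python
import hashlib
import json
from typing import Dict, Iterable, List

PII_FIELDS = ("name", "email", "phone")  # hypothetical column names


def md5_hex(value: str) -> str:
    """MD5 hex digest of a UTF-8 string."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def obfuscate(record: Dict, pii_fields: Iterable[str] = PII_FIELDS) -> Dict:
    """Return a copy of `record` with non-null PII fields replaced by MD5 hashes."""
    out = dict(record)
    for field in pii_fields:
        if out.get(field) is not None:
            out[field] = md5_hex(str(out[field]))
    return out


def fingerprint(record: Dict) -> str:
    """Stable hash of a record's contents, used to detect changes between syncs."""
    payload = json.dumps(record, sort_keys=True, default=str)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def changed_records(records: Iterable[Dict], seen: Dict, key: str = "id") -> List[Dict]:
    """Return records that are new or changed since the last sync; updates `seen`."""
    changed = []
    for record in records:
        fp = fingerprint(record)
        if seen.get(record[key]) != fp:
            seen[record[key]] = fp
            changed.append(record)
    return changed
```

Because the hash is deterministic, the same email maps to the same digest in every table, so analytics joins on obfuscated columns keep working.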
Impact & Scale
- 16 Terraform modules composing a complete AI stack, deployable to any environment in minutes
- Multiple LLM inference endpoints (summarization, emotion detection, peer support) with auto-scaling
- RAG pipeline processing thousands of content documents with sub-second query response
- All AI infrastructure under the same compliance controls as the core platform (FIPS, scanning, attestation)
- Drift detection prevents configuration divergence between environments