AI/ML & Intelligent Systems

Built 16 Terraform modules composing a complete AI stack (Bedrock, SageMaker, pgvector) deployable to any environment in minutes
Designed Bedrock Flows orchestrating multi-step AI interactions with built-in guardrails for content safety in a healthcare context
Implemented RAG pipeline with pgvector on Aurora Serverless v2 delivering sub-second semantic search across thousands of content documents
Applied the same compliance controls to AI infrastructure as to the core platform (FIPS containers, vulnerability scanning, attestation)
Built drift detection CI workflows that catch manual console changes to AI resources before environments diverge

The Challenge

The platform needed multiple AI/ML capabilities: session transcript analysis, peer support chatbots, intelligent content search, and vocal biomarker assessment. Each AI feature required its own infrastructure (inference endpoints, knowledge bases, guardrails), and all of it needed to meet the same compliance standards as the rest of the platform. The problem wasn't the AI itself. It was making AI a first-class platform service rather than a collection of one-off experiments.

Approach & Role

I designed the infrastructure layer for all AI/ML features. Every capability is Terraform-managed with the same compliance controls as any other service: FIPS containers, vulnerability scanning, least-privilege IAM, and automated deployment pipelines. The application-level AI development was collaborative with the development team, but the infrastructure, deployment, and production readiness were my responsibility.

Architecture & Patterns

Full-stack AI platform (Bedrock + ECS):

16 Terraform modules for a complete AI deployment: VPC, ECS, ECR, ElastiCache, S3, and 4 Bedrock-specific modules (Knowledge Base, Flow, Guardrail, Prompt)
Bedrock Flows orchestrate multi-step AI interactions with built-in guardrails for content safety
Lambda functions for session management, persona routing, and event-driven alerting (a 726-line handler that posts to Google Chat webhooks)
Terraform drift detection CI workflow catches manual console changes
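
To illustrate the drift check, here is a minimal Python sketch rather than the production workflow. It assumes the CI job runs `terraform plan -detailed-exitcode`, which exits 0 when the plan is empty, 1 on error, and 2 when changes are pending. The names `DriftStatus`, `classify_plan_exit_code`, and `check_drift`, and the working-directory argument, are hypothetical.

```python
import subprocess
from enum import Enum


class DriftStatus(Enum):
    IN_SYNC = "in_sync"
    DRIFTED = "drifted"
    ERROR = "error"


def classify_plan_exit_code(code: int) -> DriftStatus:
    """Map the exit code of `terraform plan -detailed-exitcode` to a status.

    0 = plan is empty, 2 = plan contains changes, anything else = failure.
    """
    if code == 0:
        return DriftStatus.IN_SYNC
    if code == 2:
        return DriftStatus.DRIFTED
    return DriftStatus.ERROR


def check_drift(workdir: str) -> DriftStatus:
    """Run a plan in `workdir` (hypothetical path) and classify the result."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit_code(result.returncode)
```

A CI step could fail the build or raise an alert whenever `check_drift` returns `DriftStatus.DRIFTED`, since a non-empty plan on an unchanged branch suggests someone edited the resource outside Terraform.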

ML inference infrastructure:

SageMaker async inference endpoints with auto-scaling for large language models
Custom GPU Docker images for model serving (optimized for clinical summarization and emotion detection)
EventBridge integration with token expiry guards for reliable async communication
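
The exact form of the token expiry guard is not described here, so the following is only a sketch of the general idea: before an async result is delivered, check that the credential has enough lifetime left, and refresh it if not. The function names, the 60-second safety margin, and the `send` and `refresh` callables are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Optional, Tuple


def token_is_usable(
    expires_at: datetime,
    now: Optional[datetime] = None,
    safety_margin: timedelta = timedelta(seconds=60),  # assumed margin
) -> bool:
    """Return True if the token outlives `now` by more than the margin."""
    current = now if now is not None else datetime.now(timezone.utc)
    return expires_at - current > safety_margin


def deliver_with_guard(
    payload: Any,
    token: str,
    expires_at: datetime,
    send: Callable[[Any, str], None],
    refresh: Callable[[], Tuple[str, datetime]],
    now: Optional[datetime] = None,
) -> Tuple[str, datetime]:
    """Refresh the token if it is close to expiry, then send the payload.

    Returns the token and expiry that were actually used, so the caller
    can cache them for the next delivery.
    """
    if not token_is_usable(expires_at, now):
        token, expires_at = refresh()
    send(payload, token)
    return token, expires_at
```

The margin matters for async inference because a result can arrive long after the request was issued, by which time the token captured at request time may have expired.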

Intelligent search (RAG):

Model Context Protocol (MCP) server for AI agent tool orchestration
pgvector with Aurora Serverless v2 for semantic search (with FAISS fallback)
Incremental content sync from CMS with scheduled background processing
JWT authentication via the existing identity provider, so AI features need no separate auth system
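
The search path with its fallback can be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: the table `content_embeddings` and its columns are hypothetical, the query uses pgvector's `<=>` cosine-distance operator, and the primary and fallback backends are passed in as plain callables so that either pgvector or a FAISS index could sit behind them.

```python
from typing import Callable, List, Sequence, Tuple

Hit = Tuple[str, float]  # (document id, distance)
SearchFn = Callable[[Sequence[float], int], List[Hit]]

# Hypothetical schema; `<=>` is pgvector's cosine-distance operator.
# Parameters: vector literal, vector literal, limit.
PGVECTOR_QUERY = """
SELECT doc_id, embedding <=> %s::vector AS distance
FROM content_embeddings
ORDER BY embedding <=> %s::vector
LIMIT %s
"""


def to_vector_literal(vec: Sequence[float]) -> str:
    """Format a vector in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"


def search_with_fallback(
    primary: SearchFn,
    fallback: SearchFn,
    query_vec: Sequence[float],
    k: int,
) -> List[Hit]:
    """Try the primary backend; use the fallback if it raises."""
    try:
        return primary(query_vec, k)
    except Exception:
        # Assumption: any backend failure should degrade to the fallback
        # rather than fail the search request.
        return fallback(query_vec, k)
```

In this sketch `primary` would wrap a database call that executes `PGVECTOR_QUERY` with `to_vector_literal(query_vec)` passed twice, and `fallback` would wrap a lookup against a locally loaded FAISS index.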

Data pipeline:

AWS Glue ETL jobs replicating production data to analytics environments
PII obfuscation (MD5 hashing of names, emails, phone numbers) before analytics ingestion
Schema drift detection with incremental sync (only changed records transferred)
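
As a rough illustration of the obfuscation step, the sketch below replaces PII fields with MD5 digests before a record leaves for analytics. It is written in plain Python rather than as a Glue job, and the field names in `PII_FIELDS` and the trim-and-lowercase normalization are assumptions, not details of the actual pipeline.

```python
import hashlib
from typing import Any, Dict

# Hypothetical column names; the real schema is not shown here.
PII_FIELDS = ("name", "email", "phone")


def md5_hex(value: str) -> str:
    """Return the MD5 hex digest of a normalized string.

    Normalizing (trim + lowercase, an assumption) makes equal values hash
    equally, so joins and distinct counts still work on the hashed column.
    """
    normalized = value.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def obfuscate_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `record` with PII fields replaced by digests."""
    return {
        key: md5_hex(str(val)) if key in PII_FIELDS and val is not None else val
        for key, val in record.items()
    }
```

Because the digest is deterministic, the same person maps to the same hashed value across tables, which keeps analytics queries joinable without exposing the raw fields.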

Impact & Scale

16 Terraform modules composing a complete AI stack, deployable to any environment in minutes
Multiple LLM inference endpoints (summarization, emotion detection, peer support) with auto-scaling
RAG pipeline processing thousands of content documents with sub-second query response
All AI infrastructure under the same compliance controls as the core platform (FIPS, scanning, attestation)
Drift detection prevents configuration divergence between environments