The problem
1,300+ creative professionals across 120 countries shared one bottleneck: each request for an on-premises GPU workstation took 3–5 business days to provision. Teams couldn't share AI models or workflows across offices, leading to duplicated effort and inconsistent output. Existing infrastructure couldn't scale to peak campaign demand, and there was no enterprise identity story for a 300+ GB custom model library.
What we shipped
A cloud-native ComfyUI platform on AWS with GPU-tiered EC2 — g6e.12xlarge (NVIDIA L40S, 48 GB VRAM) for image generation, p4de.24xlarge (A100, 80 GB VRAM) for video. A multi-region deployment across us-east-1 and eu-north-1 supports GDPR data residency. S3 holds the 300+ GB model library with versioning; EFS provides shared workflow storage. IAM Identity Center federates with the customer's IdP via SAML 2.0 with MFA enforcement. CloudWatch monitors GPU utilisation, inference latency, and cost; CloudTrail provides full API audit logging.
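The GPU tiering and residency routing above can be sketched as a small lookup. This is an illustrative model only — the instance types, GPUs, and regions come from the architecture described, but the function and mapping names are hypothetical, not the platform's actual code:

```python
# Illustrative sketch of the platform's GPU tiering and region routing.
# Instance specs and regions are from the case study; names are assumed.

# Workload type → EC2 instance tier.
GPU_TIERS = {
    "image": {"instance_type": "g6e.12xlarge", "gpu": "NVIDIA L40S", "vram_gb": 48},
    "video": {"instance_type": "p4de.24xlarge", "gpu": "NVIDIA A100", "vram_gb": 80},
}

# User locale → deployment region, to keep EU data in-region for GDPR.
RESIDENCY_REGIONS = {"eu": "eu-north-1", "us": "us-east-1"}

def place_workload(workload: str, user_locale: str) -> dict:
    """Pick an instance tier and region for a generation job."""
    tier = GPU_TIERS[workload]  # raises KeyError for unsupported workloads
    region = RESIDENCY_REGIONS.get(user_locale, "us-east-1")  # assumed default
    return {**tier, "region": region}
```

For example, `place_workload("video", "eu")` routes a video job to a p4de.24xlarge in eu-north-1, keeping EU users' data resident in the EU region.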
The outcome
GPU provisioning dropped from 3–5 business days to under 10 minutes — a 99%+ improvement. Per-image cost on L40S landed at $0.15–0.25 versus $2–5+ via external APIs (85–95% reduction). Per-user cost of ~$2,645/month delivers unlimited AI generation and replaces a stack of expensive external services.
Customer name redacted at the customer’s request. Numbers, services, and architecture are unchanged.