We are looking for a Cloud Infrastructure Architect with AWS experience to lead the architecture and implementation of LaunchDarkly. This person must have prior experience leading a LaunchDarkly implementation and rolling it out to application teams at large scale.

LaunchDarkly Feature Flags with Progressive Rollouts, A/B Testing with Zero Downtime & Full Automation

This document outlines the requirements for vendor partners supporting initiatives aimed at achieving zero downtime, reducing production incidents, improving change failure rate metrics, and enabling full automation. The scope includes feature flags, A/B testing, progressive rollouts, and support for deployment patterns across APIs, EKS, on-prem, Lambdas, and other AWS services.

Functional Scope

Zero Downtime Deployments: Implement blue/green or canary deployment models with seamless traffic switching, rollback capability, and session persistence.

Change Failure Rate Reduction: Integrate root cause tracking, automated rollback, and pre-deployment validation pipelines.

Feature Flags: Enable real-time toggling, secure access control, and auditability. Must support both server-side and client-side toggles.

A/B and Blue/Green Testing: Support traffic segmentation, real-time metrics, rollback, and privacy compliance.

Progressive Rollouts: Automate staged rollouts by region, user cohort, or environment.
Include rollback triggers based on metrics.

Automation & CI/CD: Full GitHub Actions integration, dynamic runners, and golden path patterns for EKS, Lambda, and on-prem.

Environment Patterns: Support for APIs, EKS, on-prem, Lambdas, Kafka, Glue, RDS, S3, and other AWS services.

Observability & Metrics: Integrate with Grafana and Splunk, and track DORA metrics (lead time, deployment frequency, change failure rate, MTTR).

Self-Service Enablement & Onboarding/Migration Support for Feature Flags: Empower teams with Express Lane-style pipelines, role-based access, and audit trails.

Expected Outcomes

- Pilot with at least 5 teams by November 2025.
- Enterprise adoption readiness by November, with at least 5 patterns covering both cloud and on-prem.
- 99.9%+ availability during deployments.
- 99%+ reduction in change failure rate.
- Full automation of provisioning, testing, and deployment pipelines.
- Full automation and governance for end-to-end feature flag lifecycle management.

Non-Functional Requirements (NFRs)

Performance: Low-latency toggling and fast rollback.

Security: Secure artifact storage, RBAC, audit logging, and vulnerability scanning.

Scalability: Support for multi-region, multi-tenant deployments; dynamic scaling of runners.

Resilience: Chaos testing, fault injection, and recovery time objectives (RTOs).

Compliance: Tagging enforcement, cost visibility, and privacy compliance for A/B testing, blue/green deployments, and feature flags.
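
Illustrative sketch (progressive rollout with a metric-based rollback trigger)

To make the Progressive Rollouts requirement concrete, the sketch below shows one possible way a staged percentage rollout with an automatic rollback trigger could behave. It is a generic illustration and does not use the LaunchDarkly SDK. The class and function names (ProgressiveRollout, bucket, report_error_rate), the stage percentages, and the error-rate threshold are all hypothetical and chosen only for the example.

```python
import hashlib
from dataclasses import dataclass


def bucket(flag_key: str, user_key: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given flag."""
    digest = hashlib.sha256(f"{flag_key}:{user_key}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


@dataclass
class ProgressiveRollout:
    flag_key: str
    stages: tuple = (1, 10, 50, 100)  # percent of users per stage (hypothetical)
    max_error_rate: float = 0.02      # rollback threshold (hypothetical)
    stage_index: int = 0
    rolled_back: bool = False

    @property
    def percent(self) -> int:
        """Current exposure: 0 after a rollback, else the active stage's percent."""
        return 0 if self.rolled_back else self.stages[self.stage_index]

    def is_enabled(self, user_key: str) -> bool:
        """A user sees the feature if their stable bucket falls under the percent."""
        return bucket(self.flag_key, user_key) < self.percent

    def report_error_rate(self, error_rate: float) -> str:
        """Roll back on an unhealthy metric, otherwise advance to the next stage."""
        if self.rolled_back:
            return "rolled_back"
        if error_rate > self.max_error_rate:
            self.rolled_back = True
            return "rolled_back"
        if self.stage_index < len(self.stages) - 1:
            self.stage_index += 1
            return "advanced"
        return "complete"


if __name__ == "__main__":
    rollout = ProgressiveRollout("new-checkout")
    print(rollout.percent)                    # 1
    print(rollout.report_error_rate(0.001))   # advanced
    print(rollout.percent)                    # 10
    print(rollout.report_error_rate(0.05))    # rolled_back
    print(rollout.percent)                    # 0
```

Because each user hashes to a fixed bucket, anyone included at an earlier stage stays included as the percentage grows, which keeps the user experience consistent during the rollout. In practice the error rate would come from the observability stack (for example Grafana or Splunk), and the flag state would be managed by the feature flag platform rather than in application memory.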