Overview
We are looking for a Principal Cloud Architect who operates at the intersection of deep infrastructure engineering, platform reliability, and strategic solution design. This is a high-impact senior individual contributor position: you will be the organisation's foremost expert in diagnosing and resolving complex infrastructure incidents, designing cloud modernisation blueprints, and continuously raising the engineering bar. You will architect across AWS and Azure at an expert level, champion DevOps and SRE culture, lead cloud-native platform decisions, and serve as a technical thought leader on emerging technologies, including AI-driven infrastructure and FinOps practices. This role is both hands-on and strategic. You are expected to write code, build prototypes, own architectural artefacts, and actively mentor senior engineers, while also influencing technology roadmaps and cross-functional engineering decisions at a principal level.
Responsibilities
- Troubleshooting & reliability
  - Own resolution of critical infra incidents across AWS & Azure
  - Lead RCAs and produce actionable post-mortems
  - Define and enforce SLOs, SLIs, and error budgets
  - Build runbooks, playbooks, and on-call frameworks
- Cloud architecture
  - Design scalable, secure architectures for cloud workloads
  - Architect hybrid and multi-cloud connectivity models
  - Create reference architectures and golden paths
  - Lead architectural reviews and produce ADRs
- Infra modernisation
  - Drive migration from legacy to cloud-native systems
  - Champion IaC adoption at scale (Terraform / Bicep)
  - Mature Kubernetes platform across EKS and AKS
  - Lead FinOps and cloud cost optimisation initiatives
- DevOps, observability & AI
  - Define CI/CD, GitOps, and developer platform standards
  - Drive observability using Grafana, Prometheus, OpenTelemetry
  - Architect AI/ML-ready infra and integrate AIOps tooling
  - Mentor engineers and influence the technology roadmap
Qualifications
- Must have
  - Expert in AWS and Azure architecture, networking, and security
  - Deep Kubernetes knowledge (EKS, AKS, RBAC, service mesh)
  - Strong cloud networking (VPC/VNet, BGP, Private Link, ZTA)
  - IaC at scale: Terraform, Pulumi, or CloudFormation/Bicep
  - SRE practices: SLO/SLI, error budgets, chaos engineering
  - Observability stack: Grafana, Prometheus, OpenTelemetry
  - Scripting in Python and Shell/Bash
  - Config management with Ansible (AWX/Tower)
- Good to have
  - AI/ML infra
  - AIOps
  - FinOps tools
  - Databricks / Kafka
  - Go / TypeScript
  - Edge computing
- Experience
  - 11+ years in infra / cloud engineering (8+ in architecture)
  - Led modernisation programmes end-to-end
  - Owned P0/P1 incident resolution at scale
  - Degree in CS/IT or equivalent practical experience
- Preferred certifications
  - AWS SA - Pro
  - AZ-305
  - CKA / CKS
  - Terraform Associate
  - AI-102
  - FinOps CP