We are seeking an experienced DevOps Engineer with 7+ years of experience to design, implement, and manage scalable, secure, and highly available cloud solutions across Google Cloud Platform (GCP), Amazon Web Services (AWS), and Microsoft Azure.
This role requires deep expertise in multi-cloud architecture, DevOps automation, Kubernetes, observability, and infrastructure as code, with a strong focus on enterprise-scale deployments, cost optimization, and reliability engineering (SRE practices).
Key Responsibilities
Cloud Architecture & Design
- Design and implement multi-cloud architectures across GCP, AWS, and Azure
- Define enterprise cloud landing zones and multi-project structures
- Build highly available, fault-tolerant solutions with disaster recovery (DR)
- Architect secure and scalable network topologies (VPCs, subnets, peering, hybrid connectivity)
Infrastructure as Code (IaC)
- Develop reusable infrastructure modules using Terraform / Terragrunt
- Implement modular, scalable, and DRY architecture patterns
- Manage remote state, version control, and environment separation
- Automate provisioning across multiple environments and projects
DevOps & CI/CD Automation
- Design and manage CI/CD pipelines using GitHub Actions / GitLab CI / Jenkins / Azure DevOps
- Automate:
  - Infrastructure deployments
  - Application release pipelines
- Implement blue-green / canary deployment strategies
Kubernetes & Containerization
- Manage Kubernetes platforms: GKE, EKS, AKS
- Deploy workloads via Helm, manifests, and Kustomize
- Troubleshoot production issues
- Ensure high availability and scalability of containerized applications
Monitoring, Logging & Observability
- Implement centralized observability using GCP Cloud Monitoring / AWS CloudWatch / Azure Monitor
- Configure:
  - Metrics, dashboards, alerting policies
  - Log-based metrics and custom alerts (CPU, memory, DB, network latency)
- Design centralized logging architecture (logs, sinks, retention policies)
- Enable team-based alert routing (Email, Slack, PagerDuty)
Security & Compliance
- Implement IAM roles, RBAC, and least privilege access
- Manage secrets using GCP Secret Manager / Azure Key Vault / AWS Secrets Manager
- Ensure compliance with enterprise standards: ISO, SOC 2, and HIPAA (where applicable)
- Implement encryption, audit logging, and access governance
Cost Optimization & Governance
- Monitor and optimize cloud costs across multiple projects
- Configure:
  - Budget alerts
  - Billing monitoring and cost allocation models
- Identify and implement resource optimization strategies
Automation & SRE Practices
- Define and implement SLIs, SLOs, and error budgets
- Build automated alerting and remediation workflows
- Improve platform reliability and system uptime
- Support incident response and root cause analysis (RCA)
Scripting knowledge of Python, Node.js, and Go is an added advantage.