Role: DevOps Engineer
Experience: 6-8 Years
Location: Bangalore
Job Requirements:
- Strong hands‑on experience in Azure and AWS cloud platforms.
- Deep knowledge of CI/CD tools: Azure DevOps, GitHub Actions, Jenkins, AWS CodePipeline.
- Proficiency in Infrastructure as Code: Terraform, CloudFormation, ARM/Bicep.
- Kubernetes administration (AKS, EKS), Amazon ECS, container orchestration, and service mesh.
- Scripting: Python, Bash, PowerShell.
- Strong understanding of networking, load balancing, VPNs, routing, and cloud firewalls.
- Experience with monitoring tools: CloudWatch, Azure Monitor, Prometheus, Grafana, ELK.
Key Responsibilities:
Cloud Engineering (Azure & AWS)
- Design, deploy, and manage scalable cloud infrastructure across Azure and AWS.
- Implement secure, automated provisioning using IaC (Terraform, CloudFormation, ARM/Bicep).
- Build hybrid and multi‑cloud platform solutions supporting large‑scale enterprise workloads.
DevOps & Automation
- Design and maintain CI/CD pipelines (Azure DevOps, GitHub Actions, AWS CodePipeline).
- Implement automated build, test, deployment, and environment setup workflows.
- Promote DevOps best practices including GitOps, automated testing, and release governance.
AI / MLOps Responsibilities
- Deploy and operationalise AI/ML models in production using Azure Machine Learning, SageMaker, or containerised services.
- Implement MLOps pipelines for model training, validation, deployment, and monitoring.
- Manage model registry, versioning, data pipelines, and drift detection.
- Support AI workloads using GPU/compute optimisation, scalable inference endpoints, and secure data processing.
Platform Operations (PlatformOps)
- Own platform reliability, performance, incident response, and operational governance.
- Build platform‑level automation frameworks for self‑service provisioning, patching, and disaster recovery (DR).
- Implement SRE practices including SLIs, SLOs, fault tolerance, and resilience engineering.
- Support observability using Azure Monitor, CloudWatch, Prometheus, Grafana, ELK, or equivalent.
Containerisation & Orchestration
- Manage container platforms such as Docker, Kubernetes, AKS, EKS, or ECS.
- Implement scalable cluster management, upgrades, networking, secrets, and workload optimisation.
Security & Compliance
- Implement secure CI/CD, IAM policies, role‑based access control (RBAC), and secrets management.
- Ensure compliance with enterprise standards and regulatory controls.
- Support vulnerability management, patching, and audit readiness.
Collaboration & Support
- Partner with architecture and AI engineering teams to build cloud‑native and AI‑first solutions.
- Provide environment support, root cause analysis, and platform troubleshooting.
- Contribute to technical design documents, standards, and best‑practice frameworks.