SRE/DevOps - Lead I - DevOps Engineering

UST • Full-time • Pune Division, IN • 1d ago

Role Description

Job Title: Site Reliability Engineer (SRE)

Experience

6+ Years

Role Summary

Seeking an experienced Site Reliability Engineer to design, build, and operate highly available, scalable, and reliable cloud‑based systems. The role focuses on automation, CI/CD, monitoring, incident management, and improving overall system resilience in distributed environments.

Key Responsibilities

Manage system uptime, availability, and performance across cloud‑native and hybrid architectures
Design and implement Infrastructure as Code (IaC) using Terraform
Build and maintain CI/CD pipelines using Git and Jenkins
Automate deployments, including blue/green strategies
Develop automation scripts using Shell or Python
Implement monitoring, alerting, and dashboards for microservices
Participate in on‑call rotations and handle production incidents
Lead blameless postmortems and drive preventive actions
Create and maintain detailed runbooks to reduce MTTR
Troubleshoot complex distributed systems and service dependencies

Required Skills & Experience

Strong experience with cloud platforms (AWS / GCP / Azure)
Hands‑on experience with Terraform and infrastructure automation
Experience provisioning compute, storage, and networking resources
Strong knowledge of CI/CD concepts and tools
Experience with monitoring and observability tools
Exposure to IAM, security, and access management
Strong understanding of systems, networking, storage, and databases
Experience working in Agile / DevOps environments

Education

Bachelor’s degree in Computer Science or related field, or equivalent practical experience

Skills (Keywords)

SRE, DevOps, Cloud Infrastructure, Terraform, CI/CD, Jenkins, Git, Automation, Monitoring, Incident Management, Kubernetes, AWS, GCP, Azure, Agile

Skills

Site Reliability Engineering, Terraform, Cloud CLI, CI/CD