Senior Site Reliability Engineer (SRE)

OP • Full-time • Bengaluru, IN • 1d ago

OP is partnering with a globally renowned leader in media, entertainment, and consumer experiences to hire a highly experienced and technically proficient Senior Site Reliability Engineer (SRE) to join their team. This team designs, implements, and supports infrastructure, tools, and services that power both internal and customer-facing applications across their global business units.

As a Senior SRE, you will ensure the reliability, scalability, and performance of complex systems and applications. You'll bring deep expertise in cloud infrastructure, automation, and DevOps practices, and will mentor junior engineers, lead technical initiatives, and contribute to a culture of operational excellence.

Key Responsibilities

Infrastructure & Application Support:
- Design, implement, and support infrastructure and services for large-scale production workloads.
- Troubleshoot across the full stack: network, server, OS, application, database, storage, and identity/access management.
- Provide expert-level support for Linux-based systems and intermediate-level support for Windows environments.
Cloud Engineering & Automation:
- Lead deployment and management of environments in AWS (VPC, EC2, S3, Fargate, Lambda, CloudFront, ALB/ELB, IAM, RDS).
- Contribute to multi-cloud strategies with exposure to Azure and/or GCP.
- Develop infrastructure as code using Terraform and Chef.
- Build and maintain CI/CD pipelines using GitLab CI, Jenkins, or similar platforms.
Monitoring & Observability:
- Implement and manage monitoring platforms such as Datadog, New Relic, and CloudWatch.
- Analyze system performance and proactively identify areas for optimization.
Collaboration & Leadership:
- Communicate effectively across teams and stakeholders through documentation, chat, and verbal updates.
- Mentor junior SREs and contribute to team development.
- Demonstrate technical leadership and ownership across projects and initiatives.

Basic Qualifications

Minimum 10 years of experience in Site Reliability Engineering or DevOps roles.
Expert-level experience with AWS and Linux server environments.
Proficiency in infrastructure-as-code tools (Terraform, Chef) and in scripting and programming languages (Perl, Python, Go, JavaScript).
Experience with container technologies (e.g., Docker).
Familiarity with data formats like JSON and XML.
Hands-on experience with code repositories (GitHub, GitLab).
Strong understanding of CI/CD practices and tools.
Excellent communication and documentation skills.
Proven ability to lead technical projects and mentor team members.

Preferred Qualifications

Exposure to Azure and/or Google Cloud Platform.
Experience with service mesh, Kubernetes, or serverless architectures.
ITIL or SRE-specific certifications.
Advanced training in cloud architecture or DevOps methodologies.

Education

Required: Bachelor's degree in Computer Science, Information Systems, Software Engineering, Electrical/Electronics Engineering, or equivalent professional experience.

Key Attributes

Strategic thinker with a broad perspective on cloud platforms and enterprise reliability.
Collaborative team player with a positive, inclusive attitude.
Results-driven with a focus on delivery, quality, and customer satisfaction.
Strong problem-solving skills and ability to work under pressure.
Commitment to the organization's values of innovation, excellence, and storytelling.

Decision-Making & Supervision

Operates with high autonomy and technical ownership.
Responsible for setting technical direction and mentoring peers.
Makes independent decisions on architecture, design, and implementation.
Collaborates with leadership for strategic alignment and resource planning.

Why Join Us?

At our organization, technology powers the magic. As part of our team, you'll help build and maintain the systems that support our iconic brands and global operations. We offer a collaborative work environment, opportunities for growth, and a culture that celebrates diversity, creativity, and innovation.