Description
The Technical Manager for Site Reliability Engineering (SRE) will lead a remote team of Site Reliability Engineers, ensuring operational excellence and fostering a high-performing team culture. Reporting to the US-based Director of Systems and Security, this role is responsible for overseeing day-to-day operations, technical mentorship, and strategic alignment with the company’s goals. The Technical Manager will act as a bridge between the team and senior leadership, ensuring clear communication, efficient issue resolution, and continuous improvement in service delivery.
Responsibilities:
● Provide leadership and management to a remote team of Site Reliability Engineers, ensuring alignment with organizational priorities and goals.
● Oversee team operations, including incident management, technical support, and infrastructure maintenance.
● Act as the primary point of escalation for complex technical issues, collaborating with the Director of Systems and Security and with the Quality Assurance and Product teams as needed.
● Ensure the team adheres to established SLAs for issue resolution and maintains high customer satisfaction levels.
● Mentor and develop team members, fostering growth in technical skills, problem-solving abilities, and customer engagement.
● Lead initiatives to improve operational processes, tools, and workflows, driving greater efficiency and reliability.
● Collaborate with cross-functional teams, including Product, Engineering, and Operations, to address customer needs and improve platform performance.
● Facilitate regular team meetings, performance reviews, and one-on-one sessions to ensure clear communication and ongoing development.
● Maintain and report on key performance metrics, providing insights and recommendations to senior leadership.
● Stay informed on industry trends and best practices, ensuring the team is equipped with the latest tools and methodologies.
● Participate in strategic planning and contribute to the continuous improvement of the SRE function.
Qualifications:
● Proven experience managing technical teams, preferably in Site Reliability Engineering, DevOps, or a related field.
● Strong technical background in cloud computing and infrastructure management, particularly with AWS and Linux-based systems.
● Demonstrated ability to lead and mentor teams in remote and distributed environments.
● Excellent written and spoken English communication skills and strong interpersonal skills, with the ability to engage effectively with both technical and non-technical stakeholders.
● Strong problem-solving and decision-making abilities, with a focus on root cause analysis and long-term solutions.
● Experience with automation tools (Terraform, Ansible, CloudFormation) and CI/CD pipelines.
● Familiarity with incident management practices and tools, as well as ticketing systems.
● High attention to detail and a commitment to operational excellence.
● Bachelor’s degree in a technical or quantitative science field, or equivalent work experience.
Preferred Qualifications:
● AWS certification (any level).
● Experience leading customer-facing technical teams, with a focus on improving service delivery.
● Knowledge of security best practices and governance in cloud environments.
● Strong understanding of networking concepts and system architecture.
Location: Onsite - Kolkata
Salary Range: 7 LPA
Interested candidates can send their resume to swarnima@cloudhire.ai