Lead Site Reliability Engineer

Liberty Personnel Services, Inc. • Full-time • Philadelphia, PA, US

Job Details:

The Lead Site Reliability Engineer is a senior technical leader responsible for the reliability, availability, and operational excellence of cloud-based infrastructure and a distributed platform. This role owns uptime, SLAs, and incident response while driving long-term improvements in resilience, observability, and automation. The Lead SRE is hands-on and partners closely with platform, QA, and development teams.

This role suits an engineer who thrives in high-ownership environments, balancing real-time operations with strategic reliability initiatives. You’ll define operational standards, disaster recovery practices, and automation frameworks, while leading incidents and postmortems with clarity and accountability.

Key Responsibilities

Own uptime, SLAs, and overall platform reliability
Lead incident response, root-cause analysis, and postmortems
Automate infrastructure, deployments, and operational workflows
Improve monitoring, alerting, and observability
Execute and evolve disaster recovery and business continuity plans
Optimize cloud and Kubernetes environments for scale and performance
Establish runbooks, operational standards, and reliability best practices
Provide technical leadership and mentorship

Qualifications

6+ years in SRE, DevOps, or Platform Engineering; 2+ years in a lead role
Strong experience supporting production systems with strict SLAs
Deep expertise in Kubernetes, containers, and cloud infrastructure
Proficiency with Terraform and modern IaC practices
Strong automation and scripting skills (Bash, Python, or Go)
Experience with CI/CD, GitOps, and observability tooling
Proven incident leadership and cross-functional communication skills

If you are interested in learning more, please send a copy of your resume to jz@libertyjobs.com.

Josh Zeloyle

www.libertyjobs.com

610-684-8676

jz@libertyjobs.com

https://www.linkedin.com/in/joshuazeloyle/