Title: SRE Devops infrastructure Engineer
Location: SFO, CA (5 Days Onsite) Local Candidates only final round face to face
Job Description:
We are seeking a SRE Infrastructure Resource having 8+ years ofprofessional experience ensuring the reliability, scalability, and performance of Google Cloud-based services through automation, monitoring, and proactive engineering. Key responsibilities include managing infrastructure as code (Terraform), optimizing GKE/Kubernetes, incident response, and implementing SLIs/SLOs to minimize manual toil.
This role requires close collaboration with cross‑functional teams, adherence to DevOps and Agile practices, and ownership of service quality and delivery.
Key Responsibilities
- GCP Infrastructure Management: Design, deploy, and maintain robust infrastructure components, including VPCs, Compute Engine, GKE (Kubernetes), and storage solutions.
- Automation & IaC: Utilize Terraform or Deployment Manager to manage cloud resources and build CI/CD pipelines to automate deployments. Minimizing manual, repetitive tasks by developing automation scripts and custom tools to streamline deployments and operations.
- Observability & Incident Management: Develop monitoring, alerting, and logging systems (e.g., Cloud Monitoring, Prometheus, Grafana). Act as primary on-call to troubleshoot production incidents.
- Incident Management: Serving as a first responder for system outages and conducting deep-dive root cause analysis (post-mortems) to prevent recurrence
- CI/CD Pipeline Management: Designing and supporting automated deployment pipelines using Jenkins, ArgoCD, Artifactory, DevSecOps, GitLab CI, or GitHub Actions
- Reliability Engineering: Define and maintain Service Level Indicators (SLIs) and Service Level Objectives (SLOs) - Latency, Traffic, Errors, and Saturation
- Optimization & Security: Proactively optimize infrastructure for cost, performance, and security compliance.
- Site Reliability Engineer, Google Cloud Engine AI SRE at Google: Focus specifically on AI workload health, and GCE visibility