Site Reliability Engineer (SRE) – Cloud / Video Platforms (AWS, Multi-Cloud)
Role Overview
We are seeking a Site Reliability Engineer (SRE) to join our Video Network division, where you will design, build, and operate our next-generation Video Cloud platform. This role focuses on driving reliability, scalability, automation, and cost efficiency across cloud environments, primarily AWS, with exposure to Azure and GCP.
You will work closely with Engineering, Product, Sales, and customers, playing a critical role in onboarding, production support, automation, and continuous improvement of cloud and Kubernetes-based platforms.
Key Responsibilities
Cloud & Platform Operations
- Design, build, and operate scalable, secure cloud infrastructure across AWS (primary), Azure, and GCP
- Deploy and support solutions across POC, Staging, and Production environments
- Manage Incidents, Service Requests, Problems, and Change Requests in cloud environments
- Monitor system performance, anticipate scaling needs, and ensure high availability
Reliability, Automation & Observability
- Drive automation through scripts, tools, and CI/CD pipelines to reduce manual operations
- Implement and maintain monitoring and observability frameworks
- Proactively identify, troubleshoot, and resolve moderately to highly complex technical issues
- Replicate and analyze issues in lab environments to validate fixes
Cost & Performance Optimization
- Define and track KPIs for cloud utilization, performance, and cost efficiency
- Build cost optimization dashboards and automation at both the infrastructure and Kubernetes levels
- Provide recommendations to control and optimize cloud spend
Collaboration & Customer Support
- Lead and support customer onboarding, including environment setup and configuration
- Provide technical support to partners and customers using Synamedia technologies
- Deliver technical presentations, documentation, and cross-training sessions
- Collaborate with Engineering, Sales, and Product teams to improve platform quality and customer experience
Required Skills & Experience
Technical Skills
- Strong experience as an SRE, Cloud Engineer, or DevOps Engineer
- Hands-on experience with AWS (EC2, EKS, IAM, VPC, CloudWatch, etc.)
- Exposure to Azure and/or GCP
- Experience with Kubernetes, containerized workloads, and cloud-native architectures
- Automation, scripting, and infrastructure-as-code skills (e.g., Python, Bash, Terraform, CloudFormation)
- Experience with CI/CD pipelines and deployment automation
- Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, CloudWatch, etc.)
Soft Skills
- Strong analytical and troubleshooting skills
- Excellent written and verbal communication skills
- Customer-facing mindset with strong engagement and presentation skills
- Highly adaptable and able to perform under pressure in fast-changing environments
- Proactive, self-driven, and eager to continuously learn and innovate
- Highly organized with the ability to manage multiple priorities and escalations
Nice to Have
- Experience with video platforms or streaming technologies
- Multi-cloud production experience
- Exposure to cost optimization and FinOps practices