Jobright is an AI-powered career platform that helps job seekers discover the top opportunities in the US. We are NOT a staffing agency. Jobright does not hire directly for these positions. We connect you with verified openings from employers you can trust.
Job Summary:
Comcast is a Fortune 30 global media and technology company seeking a talented and experienced Site Reliability Engineer (SRE) to join their Containers Platform Team. The role focuses on ensuring the reliability, scalability, and performance of containerized infrastructure while collaborating with cross-functional teams to design and maintain complex systems for their cloud platform.
Responsibilities:
• Deploying containerized applications across a cluster of bare-metal servers, facilitating automatic scaling of containerized applications based on demand, and managing utilization of compute resources such as CPU, memory, and storage across the cluster.
• Implementing monitoring solutions (e.g., Prometheus, Grafana) to track the health and performance of bare-metal clusters and infrastructure components. This includes setting up alerting mechanisms to detect and respond to issues proactively, and recovering from failures by restarting failed containers or reallocating workloads to healthy nodes.
• Working closely with development and engineering teams to establish CI/CD pipelines that automate the deployment and rollout of Kubernetes services, and supporting seamless rolling updates so that new versions can be deployed gradually while maintaining application availability.
• Identifying performance bottlenecks in containerized environments and optimizing resource utilization through capacity planning, auto-scaling, and performance tuning.
• Documenting processes, procedures, and best practices related to platform operations and sharing knowledge with team members.
Qualifications:
Required:
• Bachelor’s degree in computer science or a related field, or equivalent experience; typically 2 years in a Site Reliability Engineering, DevOps, or Systems Engineering role.
• Must be familiar with container technologies such as Kubernetes, containerd, and/or nerdctl. This includes the ability to deploy, manage, and scale containerized applications effectively.
• Intermediate experience implementing continuous integration and continuous delivery (CI/CD) tools and systems.
• Proficiency in scripting languages such as shell (Bash), and familiarity with YAML/JSON.
• Experience with automation scripting using tools such as Ansible playbooks or similar imaging solutions.
• General understanding of networking fundamentals and protocols, including TCP, UDP, DNS, IPv4/IPv6, and load balancing. An understanding of IP networking and traffic scaling is also important.
• Excellent analytical and problem-solving skills with the ability to effectively communicate complex technical information.
• Strong written communication skills, including the ability to create clear and informative documentation.
• Ability to work effectively across internal and external organizations.
• Flexibility to work off-hours for on-call duties. SREs are often responsible for maintaining the reliability and availability of systems outside of standard working hours, so the ability to respond to incidents and perform maintenance tasks as needed is required.
Preferred:
• While possessing the stated degree is preferred, Comcast also may consider applicants who hold some combination of coursework and experience, or who have extensive related professional experience.
Company:
Comcast is a media and technology company that connects millions of people to the moments and experiences that matter most. It is a co-owner of the SkyShowtime joint venture. Founded in 1963 and headquartered in Philadelphia, Pennsylvania, USA, it is a public company with more than 10,000 employees. Comcast has a track record of offering H-1B sponsorships.