Job Title: AWS Cloud Site Reliability Engineer (AWS SRE)
Location: Philadelphia, PA (Onsite/Hybrid)
Full Time
USC/GC Only
Must Have Technical/Functional Skills:
- Design, implement, and manage scalable, secure, and resilient infrastructure on AWS.
- Expertise in various AWS services (e.g., EC2, S3, Route 53, Load Balancer, ASG, ACM, RDS, AWS Batch, CloudWatch Logs & Metrics) with 5+ years’ experience working in AWS infrastructure.
- 5+ years of working experience with AWS serverless services (e.g., Lambda, API Gateway, Step Functions), data lake services (e.g., S3, Glue, Lake Formation, Redshift), and machine learning services (e.g., SageMaker).
- 5+ years of working experience with container-based services (e.g., Kubernetes/EKS, ECR, Docker) managed using Helm charts.
- Develop and maintain Infrastructure as Code (IaC) using Terraform, with 5+ years of working experience.
- Ensure best practices in version control using Git/GitHub.
- Build and optimize CI/CD pipelines for efficient and reliable software delivery using Jenkins and deploy, and manage artifacts using JFrog, with 3+ years of working experience.
- Manage and troubleshoot VPCs, networking configurations, and hybrid cloud connectivity.
- Administer and support Linux and Windows-based systems in cloud environments.
- Write and maintain automation scripts using Bash and Python.
- Collaborate with enterprise architects to align infrastructure with business and technical goals.
- Interpret and contribute to infrastructure diagrams and technical documentation.
- Participate in incident response, root cause analysis, and on-call rotations.
- Integrate and manage SSO (Single Sign-On) solutions for secure access control.
- Collaborate with development, QA, and DevOps teams to ensure smooth deployment processes.
Roles & Responsibilities:
- Design, implement, and maintain scalable and secure AWS cloud infrastructure to support a variety of applications and services.
- Collaborate closely with cross-functional teams, including software engineers and DevOps professionals, to architect and deploy AWS solutions that meet project requirements.
- Conduct regular performance monitoring and optimization of AWS resources to ensure cost efficiency and reliability.
- Stay updated on the latest AWS services, features, and best practices, incorporating them into cloud architecture and development processes.
- Implement and enforce AWS security measures, including IAM policies and access controls, to protect sensitive data and infrastructure.
- Experience developing code in at least one programming language (preferably Python).
- Strong experience with Infrastructure as Code (Terraform).
- Strong experience with a configuration management tool (Ansible).
- Extensive knowledge of Jenkins, CI/CD pipelines, and tools like deploy.
- Experience working in Kubernetes/EKS.
- Actively participate in code reviews, providing feedback on AWS Infrastructure as Code (IaC) and configurations to enhance system stability and security.
- Troubleshoot and resolve AWS-related issues, ensuring uninterrupted operation of applications and services hosted on AWS.