Role Description
Role Proficiency:
Acts under the guidance of a Lead II/Architect; understands customer requirements and translates them into the design of new DevOps (CI/CD) components. Capable of managing at least one Agile team.
Outcomes:
- Interprets the DevOps Tool/feature/component design to develop/support the same in accordance with specifications
- Adapts existing DevOps solutions and creates own DevOps solutions for new contexts
- Codes, debugs, tests, documents, and communicates DevOps development stages and the status of DevOps development/support issues
- Selects appropriate technical options for development, such as reusing, improving, or reconfiguring existing components
- Optimises the efficiency, cost, and quality of DevOps processes, tools, and technology development
- Validates results with user representatives; integrates and commissions the overall solution
- Helps Engineers troubleshoot issues that are novel/complex and are not covered by SOPs
- Designs, installs, configures, and troubleshoots CI/CD pipelines and software
- Automates infrastructure provisioning on cloud/on premises with the guidance of architects
- Provides guidance to DevOps Engineers so that they can support existing components
- Works with diverse teams using Agile methodologies
- Facilitates saving measures through automation
- Mentors A1 and A2 resources
- Participates in the team's code reviews
Measures Of Outcomes:
- Quality of deliverables
- Error rate/completion rate at various stages of SDLC/PDLC
- # of components reused
- # of domain/technology/product certifications obtained
- SLA for onboarding and supporting users and tickets
Outputs Expected:
Automated Components:
- Deliver components that automate parts of the installation/configuration of software/tools on premises and on cloud
- Deliver components that automate parts of the build/deploy for applications
Configured Components:
- Configure a CI/CD pipeline that can be used by application development/support teams
Scripts:
- Develop/support scripts (such as PowerShell/Shell/Python scripts) that automate installation/configuration/build/deployment tasks
Onboard Users:
- Onboard and extend existing tools to new app dev/support teams
Mentoring:
- Mentor and provide guidance to peers
Stakeholder Management:
- Guide the team in preparing status updates that keep management informed of progress
Training/SOPs:
- Create training plans/SOPs to help DevOps Engineers with DevOps activities and with onboarding users
Measure Process Efficiency/Effectiveness:
- Measure and monitor the efficiency/effectiveness of current processes and make changes to make them more efficient and effective
Stakeholder Management:
- Share the status report with senior stakeholders
Skill Examples:
- Experience in the design, installation, configuration, and troubleshooting of CI/CD pipelines and software using Jenkins/Bamboo/Ansible/Puppet/Chef/PowerShell/Docker/Kubernetes
- Experience in integrating with code quality/test analysis tools like SonarQube/Cobertura/Clover
- Experience in integrating build/deploy pipelines with test automation tools like Selenium/JUnit/NUnit
- Experience in scripting (Python/Linux/Shell/Perl/Groovy/PowerShell)
- Experience in infrastructure automation (Ansible/Puppet/Chef/PowerShell)
- Experience in repository management/migration automation – Git/Bitbucket/GitHub/ClearCase
- Experience in build automation scripts – Maven/Ant
- Experience in artefact repository management – Nexus/Artifactory
- Experience in dashboard management and automation – ELK/Splunk
- Experience in configuration of cloud infrastructure (AWS/Azure/Google)
- Experience in migrating applications from on-premises to cloud infrastructure
- Experience in working with Azure DevOps/ARM (Azure Resource Manager)/DSC (Desired State Configuration); strong debugging skills in C# and .NET
- Experience in setting up and managing Jira projects and Git/Bitbucket repositories
- Skilled in containerization tools like Docker/Kubernetes
Knowledge Examples:
- Knowledge of Installation/Config/Build/Deploy processes and tools
- Knowledge of IaaS cloud providers (AWS/Azure/Google etc.) and their tool sets
- Knowledge of the application development lifecycle
- Knowledge of Quality Assurance processes
- Knowledge of Quality Automation processes and tools
- Knowledge of multiple tool stacks, not just one
- Knowledge of build branching/merging
- Knowledge of containerization
- Knowledge of security policies and tools
- Knowledge of Agile methodologies
Additional Comments:
Site Reliability Engineer (SRE) - Cloud-Native Services (Final)
About the Role:
We are seeking a highly motivated and experienced Site Reliability Engineer (SRE) to join our core engineering team. This role is critical to maintaining the reliability, performance, and scalability of our modern, cloud-native application ecosystem. The ideal candidate will possess a strong blend of software engineering skills and deep operational knowledge, dedicated to reducing toil and driving system improvements through automation, with a specific focus on security compliance and SLA-driven operational excellence.
Key Responsibilities:
- Deployment Automation (CI/CD): Develop, maintain, and automate robust deployment pipelines using tools like Jenkins, SonarQube, and Maven/ANT, ensuring fast, reliable, and safe production rollouts.
- Compliance & Security Remediation: Own the process of analyzing, prioritizing, and remediating vulnerabilities/findings identified by security scanning tools (Qualys, SentinelOne, Wiz). Ensure continuous compliance across multiple regions and distinct environments (NPE/PROD), maintaining a compliance rate at or above the 95% target.
- Infrastructure Management: Manage, scale, and secure cloud infrastructure using CloudFormation and other IaC best practices. This includes implementing and automating AMI Rehydration processes.
- Incident Response & Management: Act as a primary responder during critical events, participating in on-call rotations using PagerDuty. Triage and resolve incidents originating from multiple geographies and business units quickly and effectively to ensure resolution within defined Service Level Agreements (SLAs). Conduct thorough post-mortems.
- Operational Excellence & Toil Reduction: Maintain and improve service availability, latency, and efficiency. Design, develop, and implement solutions to automate repetitive manual tasks.
- Observability: Implement and manage comprehensive monitoring, logging, and alerting solutions (SLOs/SLIs) using tools like ELK, Splunk, and Datadog.
Required Qualifications:
Experience & Methodology:
- 5+ years of experience working as an SRE with full project lifecycle experience.
- Experience in configuring, building, and supporting applications and operations in a public cloud environment (AWS, GCP, Azure).
- Strong exposure to Agile and Scaled Agile based development models.
- Demonstrated ability to work effectively in a fast-paced, high-volume, deadline-driven environment.
Technical Skills:
- Cloud Infrastructure: Good knowledge of cloud infrastructure (cloud services, security, IAM, VPC), and provisioning tools like CloudFormation, Terraform, or Ansible.
- Containerization & Orchestration: Expertise with Kubernetes (EKS), and container scheduler services such as ECS or GKE/Docker.
- Compliance & Platform: Demonstrated knowledge of the compliance process and remediation experience with Qualys, SentinelOne, and Wiz reports, and practical experience with AMI Rehydration.
- Programming & Scripting: Excellent coding skills in at least one high-level language (e.g., Python, Go, Java) and in scripting languages such as Perl and Unix shells (bash, ksh).
  ○ Experience in one or more of the following: .NET based app development; Java based app development.
- CI/CD & SCM Ecosystem: Extensive experience with continuous integration tools and Source Code Management (SCM):
  ○ CI Tools: Jenkins, SonarQube, JIRA, Nexus, Confluence, Maven/ANT, Gradle.
  ○ SCM: Experience performing source code control management using Bitbucket/Git (branching, merging, tagging, etc.).
  ○ Configuration Management: Experience in automation using Chef, Puppet, or another configuration management tool.
- Monitoring & Logging: Experience with tools like Elasticsearch, ELK, Datadog, PagerDuty, AppDynamics, Splunk, etc.
Skills
AWS, CI, CD, Scripting