Job description
Sr. DevOps Engineer
Role
• Be the leader of the DevOps and system administration team.
• Be an individual contributor, recognized as the expert the company looks to in all infrastructure-related areas
• Build an excellent team of dedicated experts that can manage the company’s growing internet and on-premises resources
• Be responsible for all technical infrastructure of the company
• Be the controller for all technical and other third-party cloud services subscribed to by the company
• Establish the DevOps department as an excellent service provider with high reliability and quick turnaround time.
• Prepare and implement the technical deployment architecture
• Provide troubleshooting support and assistance for production issues
Accountabilities
• Take responsibility and be accountable for the reliable operation of all production infrastructure, with an uptime of over 99.99%. Ensure uptime for other environments, such as Dev, QA, and Pre-production, as per the SLAs.
• Keep tight control of infrastructure costs, ensuring that optimal configurations are used and unused resources are released.
• Work across engineering teams to enable cost-effective, robust and secure delivery of platform services.
• Provide feedback to help the team understand technical trade-offs at the design and planning stage, particularly with respect to technical debt, performance, scalability and security.
• Perform the necessary platform development to enable and support the engineering team's goals as they evolve over the feature lifecycle, from defining requirements and specifications, through developing proofs of concept, to ultimately delivering solutions to end users. This includes:
• Provisioning and maintaining cloud infrastructure and environments
• Creating and maintaining CI/CD systems and pipelines
• Supporting engineers with training, guidance and standardised protocols for the deployment of new services that apply the relevant best practices
• Monitoring system health and managing events, alerting stakeholders when necessary
• Working closely with QA to ensure that systems are engineered to be observable and auditable across all relevant dimensions.
• Ensure adequate backups are taken as required by the disaster recovery SLAs
• Ensure that restoration drills are performed at regular intervals to check preparedness for disaster recovery as well as the quality of backups
• Ensure the company’s data and databases are well protected
• Maintain deployment architecture and infrastructure documentation
Key Skills And Experience
At a minimum, candidates must have experience with and an understanding of:
• Software development at different stages of product development, particularly from early-stage prototypes to minimum viable products
• Strong experience in infrastructure as code (IaC), software development, and continuous integration
• Proficiency in system administration and Linux administration
• CI/CD tools such as GitHub Actions, Jenkins, CircleCI, AWS CodeDeploy, and AWS CodeBuild
• Experience with relational and NoSQL databases
• Proficiency with Terraform, Terragrunt, and Ansible, and with microservices architectures.
• Working knowledge of database administration with MySQL and MongoDB
• Understanding of networking concepts and protocols
• Knowledge of containerization technologies such as Docker and Kubernetes
• Experience with cloud platforms such as AWS, DigitalOcean, Utho, Azure, or GCP
• Ability to troubleshoot and resolve issues in a timely manner
• Excellent communication and collaboration skills
• Relevant cloud certifications (e.g., AWS Certified DevOps Engineer, Certified Kubernetes Administrator) are a plus
Candidates will preferably also have
• Extensive experience with data-intensive systems
• Experience working in start-up or R&D environments
Skills: infrastructure, AWS, cloud, DevOps, CI/CD, data, databases, architecture