Job Summary
We are looking for an experienced DevOps engineer to join our team. As a DevOps engineer, you will be responsible for ensuring the uptime and reliability of our web services and applications by proactively monitoring, measuring, and improving service availability and performance. You will work closely with software development teams to build and deploy new features and services, and to troubleshoot issues that arise. Additionally, you will be responsible for automating and improving deployment, monitoring, and testing processes using tools such as Kubernetes, Docker, and Terraform. You will participate in on-call rotations to provide 24/7 support for critical systems and conduct post-mortem analysis of incidents to identify root causes and prevent similar issues from occurring in the future.
Key Responsibilities
- Ensure uptime and reliability of our web services and applications
- Proactively monitor, measure, and improve service availability and performance
- Automate and improve deployment, monitoring, and testing processes using tools such as Kubernetes, Docker, and Terraform
- Work closely with software development teams to build and deploy new features and services
- Troubleshoot issues that arise in our production environment
- Implement and manage incident response procedures to minimize the impact of outages or service disruptions
- Participate in on-call rotations to provide 24/7 support for critical systems
- Conduct post-mortem analysis of incidents to identify root causes and prevent similar issues from occurring in the future
- Stay up-to-date with emerging technologies and industry trends, and continuously improve skills and knowledge
Technical Skills
- You can design and architect enterprise and/or web-scale hosting platforms and can seamlessly administer application servers, web servers, and databases
- You have a deep understanding of cloud, virtualization, and container (Kubernetes) platforms, infrastructure automation (Terraform), and application hosting technologies
- You understand DevOps philosophy, Agile methods, and Infrastructure as Code, apply them to your work, and lead infrastructure and operations with these approaches
- You have a history of working with server virtualization, IaaS and PaaS cloud platforms, and infrastructure provisioning and configuration management tools
- You can write scripts using at least one scripting language and are comfortable building Linux and/or Windows server systems
- You have experience with continuous integration tools across different tech stacks, whether web or mobile
- You've previously worked with monitoring systems for stress and performance testing, and with observability patterns: distributed tracing (OpenTracing), log aggregation, audit logging, exception tracking, health check APIs, application metrics, and self-healing/multi-cloud setups
- Bonus points if you have experience with unit testing and automated testing tools
Requirements
- Bachelor's or Master's degree in Computer Science or a related field, or equivalent work experience
- 5+ years of experience as an SRE, systems administrator, or DevOps engineer
- Strong understanding of Linux/Unix administration and system-level issues
- Proficient with containerization technologies such as Docker and Kubernetes
- Experience with cloud computing platforms such as AWS, Azure, or GCP, and with Infrastructure as Code (IaC) tools
- Strong networking skills, including TCP/IP, DNS, load balancing, and security
- Experience with automation tools such as Ansible, Puppet, or Chef
- Proficient in one or more programming languages such as Python, Java, or Go
- Experience with monitoring and alerting tools such as Prometheus, Grafana, or Nagios
- Experience working closely with software development teams to build and deploy new features and services
- Good understanding of security concepts and best practices
- Excellent problem-solving skills, communication skills, and ability to work effectively in a team environment
Skills: Docker, Ansible, AWS, Terraform, Kubernetes