Job title: DevOps Engineer
Location: Bangalore
Experience: 3-12 years
Notice period: 30-60 days
DevOps & Cloud Infrastructure Engineer
To help us build robust and scalable systems that improve the customer experience, we're looking for a DevOps engineer to take responsibility for developing and provisioning infrastructure and observability platform tools such as Prometheus, Grafana, and distributed logging and tracing stacks. The ideal candidate will have a background in shell scripting or Python, and will work with developers and engineers to ensure that infrastructure and observability practices and processes work as intended.
Objectives of this role
- Build and implement new DevOps tools and Terraform modules
- Automate and improve the development and release processes
- Design and implement security controls at the infrastructure layer
- Automate releases across environments, including the disaster recovery region
Responsibilities
- Deploy updates and fixes, and provide Level 2 technical support
- Build tools to reduce the occurrence of errors and improve the customer experience
- Develop software to integrate with internal back-end systems
- Design and implement a distributed logging and tracing stack
- Develop scripts to automate metrics collection and operational dashboards
- Design procedures for system troubleshooting and maintenance
Required skills and qualifications
- Experience as a DevOps engineer or in a similar software engineering role
- Proficiency with the Git version control system
- Good knowledge of shell scripting or Python
- Working knowledge of Terraform, databases, and SQL
- Working knowledge of Prometheus and Grafana
- Problem-solving attitude and collaborative team spirit
Preferred skills and qualifications
- Bachelor's degree in computer science, engineering, or a relevant field
- Experience in civil engineering or customer experience
- Experience in developing/engineering applications for a large company
- Prometheus, PromQL expressions
- Grafana dashboards, PagerDuty, Jaeger (any)
- OpenTelemetry, OpenTracing (any)
- Elasticsearch, Logstash, Kibana (ELK) stack a big plus
- Micrometer, Loki, Google BigQuery logging (any)
- Automate failover/scale-up/scale-down
- Automate operational and performance-testing activities
- Load testing, chaos testing a plus
- k6/JMeter/Chaos Toolkit/Gremlin
- Hands-on experience with AWS infrastructure-as-code
- One or more of Kubernetes, Helm, Ansible, Terraform