We’re looking for a talented DevOps Engineer to join a fast-growing AI platform company that’s shaping the future of intelligent systems. In this role, you’ll work on mission-critical infrastructure, ensuring reliability, scalability, and high performance across global operations.
🔹 What You’ll Do:
- Manage and maintain container clusters (Kubernetes, Docker) and distributed systems (Kafka, Redis, Elasticsearch)
- Design and enhance infrastructure operations platforms, CI/CD pipelines, and monitoring and logging systems
- Implement high-availability and reliability engineering practices using SLA/SLO frameworks
- Automate operations, build self-service tools, and drive platform standardisation
- Continuously improve architecture, deployment strategies, and operational processes
🔹 What We’re Looking For:
- 2+ years of experience in Systems Operations, DevOps, or Site Reliability Engineering
- Strong knowledge of cloud platforms (AWS, Azure, or GCP) and distributed systems
- Proficiency in scripting and automation (Shell, Python)
- Expertise with Kubernetes, Docker, CI/CD pipelines, and infrastructure monitoring tools
- Familiarity with Nginx, MySQL, Redis, Kafka, and Elasticsearch
- Bonus: Experience with service mesh, Cilium CNI, eBPF, and cloud-native networking best practices