Position Overview
We are looking for a skilled DevOps & Site Reliability Engineer (SRE) to join our blockchain engineering team. This hybrid role blends DevOps principles with SRE practices to ensure our blockchain systems are reliable, scalable, and efficient. You will own the full lifecycle of infrastructure — from design and automation to monitoring and optimization — while driving resilience, performance, and developer productivity.
Key Responsibilities
- Design, deploy, and maintain scalable, secure, and highly available infrastructure for blockchain services.
- Manage and optimize AWS services (EC2, RDS, S3, EKS, ECR) and support workloads on Cloud Server.
- Build and maintain robust CI/CD pipelines using Jenkins and Git for automated delivery.
- Manage the deployment of the core stack, including Linux, Nginx, Redis, and MySQL, ensuring reliability and performance.
- Streamline software release cycles with zero-downtime deployments.
- Establish monitoring, alerting, and logging frameworks (Prometheus, Grafana, ELK, CloudWatch, GCP Monitoring).
- Proactively identify, diagnose, and resolve performance bottlenecks, availability risks, and incidents.
- Lead post-incident reviews and drive improvements to eliminate recurring issues.
- Partner with blockchain engineers and developers to improve developer experience and infrastructure reliability.
- Drive automation initiatives to reduce toil and manual operations.
Requirements
- Bachelor's degree or above in Computer Science, Engineering, Information Systems, or a related field, with more than 3 years of relevant experience.
- Strong experience with Linux administration, networking, and system performance tuning.
- Hands-on experience with AWS (EC2, RDS, S3, EKS, ECR); exposure to Google Cloud.
- Proficiency in Jenkins, Nginx, Redis, MySQL, Git, and other deployment and optimization tools.
- Practical knowledge of infrastructure-as-code.
- Familiarity with monitoring and observability tools such as Prometheus, Grafana, ELK, and CloudWatch.
- Experience in distributed systems, scaling, and high-availability environments.
- Familiarity with DevSecOps practices and compliance frameworks.
- Ownership mindset with strong problem-solving ability.
- Good communication skills and a collaborative, team-oriented attitude.
- Proficiency in English and Chinese for seamless stakeholder communication.