Flexibility and Efficiency Team:
- If you’re looking for a fast-paced, mission-driven organization with endless opportunities to learn and excel, then this is the place for you.
- We are looking for a Middle Database Administrator to help us push our infrastructure and configuration management to the next level.
- You'll have experience designing, implementing, and maintaining large-scale, complex, highly critical applications.
Working with us:
- Work alongside diverse, world-class talent in an environment where learning and growth opportunities are endless
- Tackle fast-paced, challenging, and unique projects
- Work in a truly global organization, with international teams and a flat organizational structure
- Competitive salary and benefits
- Balance life and work with flexible working hours and casual work attire
Responsibilities:
- Develop a deep understanding of the business, take responsibility for high-availability governance of financial services, and continuously improve business SLAs.
- Build and improve observability platforms/tools, enhance monitoring efficiency, and shorten fault location and recovery time.
- Provide 24/7 on-call coverage: respond to and handle major production incidents in a timely manner, coordinating the relevant product, R&D, operations, and infrastructure teams to jointly investigate and resolve problems. Own MTTR, covering both fault response time and fault resolution time.
- Continuously maintain and improve system stability and performance, and handle day-to-day requests to improve team efficiency.
- Steer foundational SRE work toward automation, platform-based tooling, and intelligent operations, improving the overall operations and maintenance efficiency of the component systems in the infrastructure.
- Build up operational best practices, provide guidance on business architecture design and component selection, and produce operational technical documentation.
- Write relevant documents and regularly share technical and management achievements with all staff.
- Other related work.
Requirements:
- Bachelor's degree in computer science; more than 10 years of experience in system operations and maintenance/SRE at large or medium-sized Internet or financial companies, and 5 years of experience maintaining message middleware, caches, k8s, or databases in production.
- Familiar with the Linux operating system and Shell programming; proficient in one or two of Golang, Java, and Python.
- Familiar with the basic principles of networking, including TCP/UDP, HTTP, sockets, CDN, and related technologies.
- Proficient in load balancing, microservice architecture, and operations architecture, including the working principles, deployment, and usage of common middleware such as Nginx, LVS, Redis, Kafka, and Elasticsearch.
- Familiar with monitoring tools, including but not limited to SkyWalking, Zipkin, Pinpoint, Prometheus, Grafana, and Zabbix. Practical experience in APM or observability is preferred.
- Understand the technical principles of Docker/k8s container platforms and be familiar with the Rancher platform.
- Good team communication, initiative and drive, a sense of responsibility, strong logical thinking, and data analysis and problem-solving skills.
Preferred Qualifications:
- Experience in remote project collaboration across regions.
- Experience in technical roles in the securities, futures, or blockchain industries.
- Experience in developing complete automated operation and maintenance tools.
Job Highlights:
- High availability management of the company's key fintech business line.
- Continuously promote high availability management through incident operations, quality operations, and risk operations, improving business SLA.
- Construction and refinement of automated operation and maintenance systems and operation systems, continuously enhancing efficiency.
Technology Stack:
DB:
mysql pgsql elasticsearch redis mongodb etcd OceanBase ClickHouse
Middleware:
nacos kafka zookeeper rabbitmq rocketmq apisix nginx
Cloud Native:
kubernetes rancher docker Prometheus grafana
DataCenter:
nas ceph PVE OpenStack
Network/Load Balancing:
CF-CDN haproxy frp openvpn-as apisix
CI/CD:
confluence JIRA gitlab harbor
Programming Languages:
go java python PHP