About AEREO
AEREO (formerly Aarav Unmanned Systems) is India’s leading drone solution tech startup in the commercial segment. We provide end-to-end solutions to government and private enterprises in mining & metals, urban planning, large infrastructure, irrigation, agriculture and energy. We are early starters and market leaders in the Indian drone industry. We believe in solving real problems and in using drone technology to drive a revolution. Our strengths are our perseverance, clarity, collaborative approach, innovation and our team.
We have been funded by well-known Indian VCs in our growth journey so far. However, our business is already self-sustaining and growing at a fast pace. We love machines, especially aerial robots, and believe that drones are shaping the future of the world. Aereo is actively looking for self-driven and process-oriented individuals who are interested in joining team Aereo on this fascinating growth journey and becoming early contributors to the country’s fast-growing drone ecosystem.
The role is in the Platform Team at Aereo, whose charter is to build and maintain our cloud-based enterprise SaaS platform, Aereo Cloud. Aereo Cloud is a powerful platform that enables organizations to store, manage, visualize and analyze their drone-based geospatial data at petabyte scale and generate critical, actionable business insights from this data.
We are looking for an experienced DevOps leader who will help take Aereo Cloud to the next level. This person will lead a key set of initiatives that require working closely with customers, product and business stakeholders to improve the end-user experience, and will constantly strive for excellence to establish Aereo Cloud as the best cloud platform for drone-based GIS data. In addition, they will oversee the complete infrastructure to deliver the best integrated product experience for our Aereo Cloud users, and own and drive the engineering/technical strategy for the team with a forward-looking lens, so that the platform not only meets product requirements but also scales over the next 2-5 years.
What You’ll Do
- Understand the vision and the bigger picture of Aereo Cloud and ensure the team fully understands and appreciates how their work fits into the larger scheme of things.
- Lead reliability engineering projects and drive them to closure.
- Write code and perform code reviews for best practices and code quality.
- Contribute to the design/architecture of the system.
- Automate processes and find opportunities to improve observability and availability of the Platform and reduce toil.
- Supervise a team of DevOps Engineers, ensuring production applications are stable, reliable, and well documented.
- Own end-to-end availability and performance of mission-critical services.
- Analyze and debug complex issues across tiers from frontend to mid-tier to infrastructure.
- Practice sustainable incident response and blameless RCAs and postmortems.
- Grow and develop teams; drive conversations with the SREs on topics such as career development, and align their growth with the long-term vision and wider business needs.
- Participate in the end-to-end recruiting process, hiring and onboarding exceptional SRE talent.
- Define, measure & own key metrics for the performance of your and your team’s functional areas.
Preferred Qualifications
- More than 12 years of experience handling systems for large scale production environments and building DevOps/SRE teams.
- A self-starter, able to build, drive and advocate for SRE solutions.
- Effective cross-functional collaboration skills to develop tools for secure, scalable, and reliable systems.
- Solid understanding of SRE concepts like SLAs, SLOs, SLIs, error budgets, MTTR, MTTD, etc.
- Experience with a variety of tools that help manage, understand, and debug large, complex distributed systems.
- Good programming experience (Python/Go).
- Hands-on experience with Kubernetes and Docker.
- Working knowledge of at least one of the major cloud platforms (AWS, Azure, GCP).
- Experience with monitoring and logging tools (e.g. Datadog, ELK, Prometheus, Grafana).
- Good knowledge of Unix systems, networking, web technologies, and databases.
- Expertise in troubleshooting issues and bugs.
- Experience writing, deploying and debugging Terraform scripts.
- Incident Management experience coupled with effective communication skills.