Company
Software is eating the world, and AI is eating software. Just as the last two decades saw software penetrate every industry and use case, the next two decades will see AI touch everything we do. As more and more of the mission-critical systems that run our lives are entrusted to algorithms, reliability and safety become a real concern. This is where CalypsoAI comes in: we build tools for trusted AI.
This role combines the immediate responsibilities of a skilled Technical Support Engineer within a SaaS (Software as a Service) environment with a clear path for professional development toward a Site Reliability Engineering (SRE) focus. The ideal candidate possesses a strong technical foundation, thrives in a fast-paced troubleshooting environment, and is passionate about automating processes to ensure an exceptional customer experience.
Proactive Monitoring and Uptime Assurance:
- Monitor key SaaS application metrics, logs, and alerts to identify and preempt potential service disruptions.
- Maintain a 24/7 support model ensuring high availability and performance of the SaaS environment.
Customer-Centric Incident Response:
- Serve as a primary point of contact for technical customer inquiries and issues.
- Collaborate with customers to understand their needs, troubleshoot problems effectively, and provide clear explanations.
- Triage, escalate, and own the resolution of technical incidents.
Development Team Collaboration:
- Analyze metrics, logs, and incident reports to provide critical insights to the development team for troubleshooting and continuous improvement.
- Drive efficiency by identifying patterns and areas for automation.
Site Reliability Engineering (SRE) Development:
- Propose and implement process automation to streamline incident resolution and improve overall system reliability.
- Introduce SRE concepts and best practices as the team matures, enhancing monitoring, alerting, and self-healing capabilities.
Qualifications
- 2-4 years of experience in a technical support or systems administration role
- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent experience
- In-depth understanding of SaaS environments and cloud-based architecture (preferably AWS)
- Proficiency in a scripting language (e.g., Python)
- Strong understanding of networking, databases (PostgreSQL), and operating systems (Linux)
- Solid understanding of web technologies (HTTP, REST APIs, JSON, etc.)
- Excellent problem-solving, analytical, and communication skills, both written and verbal
- Ability to work independently and collaboratively within a team
- Willingness to learn new technologies and adapt to changing requirements
- Experience working with a ticketing system
Bonus
- Knowledge of monitoring tools (e.g., Prometheus, Grafana)
- Experience with Kubernetes
- Experience with infrastructure-as-code or configuration management tools (e.g., Terraform)
- Familiarity with cloud infrastructure and technologies
- Previous exposure to SRE principles and practices
Why CalypsoAI?
- To start with: we at CalypsoAI are building the first-ever security-for-AI solution!
- If that's not enough: we are a fast-growing startup, which means there are great opportunities to grow within the company!
- Endless opportunities to learn from our inspirational, talented team members
- You will never be bored at work again if you join us!