We’re hiring a Cloud Site Reliability Engineer (SRE) to support a high-impact infrastructure transition focused on disaster recovery (DR) and data center migration. This is a hands-on, execution-driven role where you’ll help ensure critical systems remain resilient, recoverable, and fully operational during a major transformation.
What You'll Be Doing:
- Support disaster recovery migrations and infrastructure transitions
- Plan and execute DR testing, failover validation, and recovery readiness
- Help migrate and operationalize backup and recovery environments
- Work closely with cloud, infrastructure, and application teams to support system reliability
- Execute runbooks and support day-to-day DR and backup operations
- Identify risks, gaps, and improvements in recovery processes
- Maintain documentation, test results, and operational procedures
What We're Looking For:
- Experience in Site Reliability Engineering, Cloud Operations, or Infrastructure roles
- Hands-on experience with disaster recovery environments and backup systems
- Strong understanding of DR concepts (RTO/RPO, failover/failback) and of system resiliency and recovery testing
- Experience working in cloud or hybrid environments (AWS, Azure, etc.)
- Ability to support time-sensitive, high-priority infrastructure initiatives