Emerald Resource Group
Sr DevOps Engineer
Salary: Up to $110k for the right background
Hybrid: In Richfield, Ohio
Only US Citizens will be considered
About the Role
We are seeking a DevOps Engineer to scale our high-velocity payments platform and manage transactions within our multi-tenant AWS environment. You will dive into our established architecture to improve automation and system resiliency while solving live production challenges. This role requires a practical engineer who can optimize processing engines and maintain compliance as our merchant volume grows.
Responsibilities
- Ensure the reliability, availability, and performance of a multi-tenant production system
- Scale and operate AWS-based infrastructure supporting a Java web application
- Monitor and troubleshoot issues across application, database, cache, and data warehouse layers
- Improve observability through metrics, logging, and alerting
- Participate in on-call rotations and lead incident response and root cause analysis
- Identify performance bottlenecks and scaling limits in a shared-tenant environment
- Automate operational tasks and reduce toil where it matters most
- Work within existing frameworks and tooling to make systems safer and more scalable
- Partner with developers to improve deployments, capacity planning, and failure handling
- Implement automated load and fuzz testing
- Define key service-level objectives (SLOs)
Technologies You’ll Work With
- AWS (EC2, ECS, RDS, ElastiCache, Redshift, and related services)
- Java-based web applications
- MySQL (performance tuning, scaling, reliability)
- Amazon ElastiCache (Redis/Memcached)
- Amazon Redshift
- Monitoring and alerting tools (Graphite, Grafana, CloudWatch)
Qualifications
- 3+ years of experience in SRE, DevOps, or production operations roles
- Strong understanding of AWS infrastructure and cloud-native scaling patterns
- Experience supporting Java applications in production
- Solid knowledge of MySQL performance, replication, and scaling strategies
- Experience operating cache layers and data stores at scale
- Understanding of multi-tenant architectures, including isolation, noisy-neighbor issues, and capacity planning
- Strong Linux fundamentals and troubleshooting skills
- Ability to stay calm, think clearly, and prioritize during incidents