Active Security Clearance with Polygraph Required
We are seeking a highly motivated DevOps Engineer to support the Report Authoring and Dissemination (RAD) modernization effort. This initiative consolidates multiple authoring, dissemination, and supporting capabilities into a modernized, scalable, and highly available platform for mission-critical operations.
The selected candidate will play a key role in designing, deploying, automating, and maintaining complex distributed systems and applications across cloud and virtualized environments. This position requires a strong understanding of infrastructure automation, containerized application deployment, CI/CD pipelines, system administration, and troubleshooting within Linux-based environments.
The ideal candidate thrives in fast-paced operational environments, enjoys solving complex technical challenges, and is passionate about improving system reliability, scalability, and performance through automation and DevOps best practices.
Key Responsibilities
- Deploy, configure, and maintain mission-critical applications supporting the RAD platform
- Design, implement, and maintain CI/CD pipelines to support rapid and reliable software delivery
- Provision and manage cloud and virtual infrastructure using Infrastructure-as-Code (IaC) methodologies
- Develop and maintain Terraform and Ansible automation scripts for infrastructure deployment, configuration management, and operational support
- Configure, manage, and troubleshoot Linux-based servers and environments
- Monitor system health, performance, and availability to ensure operational requirements are consistently met
- Investigate, diagnose, and resolve application, infrastructure, networking, and deployment issues
- Support containerized application deployments using Docker Swarm, Docker Compose, and related technologies
- Collaborate closely with software developers, system administrators, and architects to support application modernization initiatives
- Implement security best practices, patch management, and system hardening procedures
- Support distributed web applications and microservice-based architectures
- Participate in system upgrades, release management activities, and environment maintenance
- Create and maintain technical documentation, deployment procedures, and operational runbooks
- Provide on-call support through a rotating pager schedule to ensure timely response to production incidents and operational emergencies
Multiple Levels of Performance
$117,000 - $233,000
Requirements
Required Qualifications
- Experience administering and supporting Linux operating systems in enterprise environments
- Hands-on experience with AWS cloud services, virtual machine management, and hybrid infrastructure environments
- Strong experience with Docker container technologies, including Docker Swarm and Docker Compose
- Experience deploying and maintaining distributed web applications
- Experience developing and maintaining CI/CD pipelines utilizing tools such as GitLab CI/CD
- Proficiency with Infrastructure-as-Code tools, including Terraform
- Experience developing and maintaining Ansible playbooks, roles, and automation workflows
- Strong understanding of Git source control and branching strategies
- Experience troubleshooting production systems, application deployments, and infrastructure issues
- Ability to work independently while collaborating effectively within a cross-functional team
Desired Qualifications
- Experience developing automation and utility scripts using Python
- Experience administering and configuring Apache NiFi data flow environments
- Experience supporting and maintaining Elasticsearch clusters
- Familiarity with Java and Spring Boot applications, including application log analysis and troubleshooting
- Experience configuring and supporting HAProxy load balancing solutions
- Experience administering MongoDB databases
- Familiarity with distributed system architectures and large-scale enterprise applications
- Experience supporting modernization or cloud migration initiatives
Operational Requirements
- Candidates must participate in a rotating 24/7 on-call support schedule
- On-call duty is typically one week out of every four to six weeks
- Engineers are compensated for four hours during their assigned on-call week, on top of any hours worked responding to incidents
- Ability to respond to and troubleshoot critical system issues during off-hours when required
Preferred Attributes
- Strong analytical and problem-solving skills
- Excellent verbal and written communication abilities
- Ability to work in a highly collaborative Agile environment
- Self-starter capable of managing competing priorities and delivering results
- Passion for automation, continuous improvement, and operational excellence
Impact
This role provides an opportunity to directly influence the modernization of mission-critical reporting and dissemination capabilities by implementing scalable infrastructure, improving deployment automation, and enhancing overall system reliability for operational users.