DevOps/Infrastructure Engineer
The DevOps/Infrastructure Engineer builds and maintains cloud infrastructure and deployment pipelines for a generative AI platform. This role is responsible for environment provisioning, container orchestration, CI/CD automation, and infrastructure scaling to support a microservice-based architecture. The engineer takes a broad perspective on infrastructure problems and exercises independent judgment in selecting techniques and evaluation criteria to obtain results.
Key Responsibilities
Infrastructure & Cloud Architecture
- Provision and manage AWS cloud environments across development, staging, and production tiers
- Deploy and operate containerized services and tool servers within a microservice architecture
- Manage container orchestration with blue-green and canary deployment strategies
- Support infrastructure for platform dependencies including search engines, caching layers, and relational databases
- Exercise independent judgment in evaluating and selecting infrastructure approaches and tooling
- Build and manage infrastructure on AWS, leveraging services such as ECS, ECR, IAM, VPC, S3, and CloudWatch
DevOps & CI/CD
- Maintain and extend CI/CD pipelines using shared library patterns for service builds, testing, and deployment across a microservice fleet
- Collaborate with developers on container optimization, startup configuration, and environment management
- Identify gaps between system components and their designs, and deliver solutions that enable team autonomy
- Develop actionable insights by analyzing infrastructure trends and DevOps best practices, and communicate recommendations to management
Monitoring, Scaling & Reliability
- Implement infrastructure monitoring, alerting, and scaling strategies for AI workloads and supporting services
- Ensure platform reliability through capacity planning, performance testing, and incident response processes
- Optimize resource utilization and cost across the cloud environment
Quality & Testing
- Implement and maintain automated infrastructure testing including smoke tests, health checks, and deployment verification
- Design and execute load testing and performance benchmarking for platform services and AI workloads
- Ensure CI/CD pipelines enforce quality gates including linting, security scanning, and test execution before deployment
- Validate infrastructure-as-code changes through automated testing and peer review processes
Mentorship & Collaboration
- Guide team members in infrastructure best practices, deployment patterns, and operational procedures
- Partner within and across teams to meet shared goals and priorities around platform stability and delivery velocity
- Champion collaborative resolution of infrastructure issues and contribute to internal process improvement initiatives
Security & Compliance
- Assist with adherence to technology policies and comply with all security controls
- Implement secure coding practices, particularly when handling personally identifiable information (PII) and sensitive regulatory data
- Participate in threat modeling and security discussions for API and infrastructure components
- Understand and apply FINRA's security standards and best practices for regulated financial environments