DevOps Engineer

Grupo Galvão • Full-time • Austin, TX, US • 3h ago

About The Role

Teza is a systematic trading firm building quantitative strategies across multiple asset classes. We are looking for a DevOps Engineer to own and evolve our infrastructure platform: the systems that our quants, developers, and traders rely on every day.

Our infrastructure team is at an inflection point. We have a working platform that supports active trading and research, and we are now investing in making it more robust, observable, and developer-friendly. The successful candidate will have significant influence over the direction of our infrastructure: shaping tooling choices, establishing standards, and building systems that scale with the firm.

Location

Austin, Texas or Yerevan, Armenia

What We're Building Toward

As a team, we are working toward the following goals. You will play a central role in defining the approach and driving execution:

Full-stack observability across all internal services (unified metrics, centralized logging, distributed tracing, and actionable alerting) so that engineers and traders have clear, real-time visibility into system health and performance.
Reliable, self-service compute orchestration spanning Slurm (HPC/ML workloads), Hadoop (batch data processing), and Airflow (workflow scheduling), enabling researchers and data engineers to run workloads at scale without infrastructure bottlenecks.
Mature secrets management for trading credentials, API keys, certificates, and service-to-service authentication, with rotation policies, auditing, and tight integration into deployment workflows.
Unified release pipelines that bring consistency to how diverse applications (trading strategies, data pipelines, and real-time trading systems) move from development to production, each with its own build, test, and deployment needs.
A well-maintained platform foundation where shared services (identity management, GitHub Actions runners, VPN, observability tooling) stay current and reliable without disrupting active trading.
Strong security posture across production and research environments (network segmentation, access controls, vulnerability management, and compliance) that evolves alongside the platform rather than being bolted on after the fact.

Requirements

Experience building and operating internal platform services for development teams (CI/CD, compute, monitoring, developer tooling) — not just consuming them.
Strong proficiency in Linux systems administration and container orchestration (Docker, Kubernetes, or similar).
Deep understanding of the software development lifecycle and how infrastructure supports engineering teams — from local development through CI to production deployment.
Familiarity with ML and data-intensive workflow requirements: GPU scheduling, large dataset access patterns, experiment tracking, and reproducible compute environments.
Proficiency with at least one major cloud provider (AWS, GCP, or Azure) including networking, IAM, and managed services.
Experience designing and operating hybrid infrastructure — cloud, on-premises, and colocation environments — with an understanding of the tradeoffs between them.
Hands-on programming ability in Python or another scripting language, sufficient to build tooling, automation, and infrastructure-as-code — not just run playbooks.
Solid understanding of core network protocols and services: DNS, LDAP, SMTP, TLS, HTTP, and SSH.
Practical knowledge of infrastructure security: firewall management, access control models (zero-trust, bastion hosts), vulnerability scanning, patch management, and audit logging.

Nice to Have

Experience in a trading firm or other environment with strict uptime and latency requirements.
Familiarity with infrastructure-as-code tools (Terraform, Pulumi, Ansible).
Experience with log aggregation and SIEM systems.
Understanding of compliance frameworks relevant to financial services.

Benefits