hackajob has partnered with a company that provides efficient, intuitive, centralized compliance resources for both its internal employees and its external clients.
Role: DevOps Engineer
Location: Austin, TX
Work setup: hybrid
Requirements
- A minimum of 7 years of experience in maintaining optimal performance of online production environments, utilizing bare metal, cloud, and container technologies.
- At least 4 years of experience managing production Kubernetes infrastructure, with exposure to cloud vendor Kubernetes solutions such as EKS, AKS, and GKE.
- Strong experience with Docker for containerization, including creating and managing Docker images and containers.
- Strong experience in architecting and managing SaaS applications in Kubernetes, with specific experience in MLOps and LLMOps.
- Deep understanding of the machine learning lifecycle, including model training, deployment, monitoring, and scaling, particularly using AWS SageMaker.
- Experience with MLOps tools and frameworks, such as Kubeflow, MLflow or similar, and their integration into Kubernetes environments.
- Familiarity with LLMOps, including the deployment and management of LLMs in production environments.
- Solid experience in scripting languages such as Python.
- Experience with infrastructure deployment and automation tools such as Terraform, CloudFormation, etc.
- Working knowledge of industry-standard build tooling and CI/CD using GitHub and GitHub Actions.
- Expertise in monitoring and logging solutions such as Prometheus and Grafana.
- Good understanding of networking and security concepts.
- Strong knowledge of Linux systems and shell scripting.
- Strong communication and collaboration skills, with experience working closely with data scientists and ML engineers.
- Experience working in an agile environment and understanding of agile methodologies.
- Certifications such as CKA (Certified Kubernetes Administrator) or CKAD (Certified Kubernetes Application Developer) are a plus.
Nice to Haves:
- Experience with workflow orchestration tools like Apache Airflow, particularly for managing complex data pipelines and ML workflows.
- Experience with GitOps tools such as ArgoCD, for managing Kubernetes deployments through version-controlled repositories.
- Familiarity with GPU acceleration technologies and their integration with Kubernetes for optimizing ML model training and inference.
- Knowledge of data versioning tools and frameworks like DVC (Data Version Control) in the context of MLOps.
- Experience with cloud cost optimization strategies, particularly in environments running intensive ML and AI workloads.
If you're interested in finding out more about this fantastic opportunity, please get your application in and we can arrange a call.
hackajob is a recruitment platform that matches you with relevant roles based on your preferences; to be matched with roles, you need to create an account with us.
*This role requires you to be based in the US*