Role: DevOps Engineer
Location: Princeton, NJ
Work Mode: Hybrid (2 days a week)
Experience: Minimum 3 years
Role Overview
As a DevOps Engineer, you will be responsible for the end-to-end lifecycle of Kubernetes-based applications, from onboarding and CI/CD integration to production readiness, observability, and operational excellence. Strong communication skills, technical ownership, and the ability to operate in a fast-paced, regulated environment are essential.
Key Responsibilities
- Own end-to-end Kubernetes application onboarding across on-prem and cloud environments
- Build and maintain Jenkins CI/CD pipelines for container image builds and deployments across multiple image registries
- Implement authentication and authorization workflows using Keycloak and AWS IAM across EKS and on-prem Kubernetes clusters
- Design and operate Kafka clusters, including monitoring, scaling, and operational support
- Implement observability, monitoring, and alerting solutions for container networking and storage
- Conduct production-readiness reviews and maintain onboarding checklists for Kubernetes workloads
- Automate operational procedures including disaster recovery, stress testing, and environment provisioning using Python
- Troubleshoot production incidents, perform root cause analysis, document issues, and implement preventive measures
- Apply Site Reliability Engineering (SRE) principles, including SLIs, error budgets, alerting strategies, and incident reviews, and participate in an on-call rotation
Qualifications
- 3+ years of experience building Linux containers and working with container orchestration platforms
- 3+ years of experience authoring and managing Kubernetes manifests, including Helm charts and Kustomize
- Strong experience with Kubernetes networking, including ingress controllers (e.g., NGINX), service mesh, and container networking
- Solid understanding of TLS, certificates, and secure communications
- Experience implementing GitOps workflows using tools such as Flux or Argo CD
- Proficiency in Shell scripting and Python; Go experience is a plus
- Strong Linux systems background and deep understanding of containerization principles
- Experience with observability tooling, including Prometheus, Alertmanager, Grafana, and Splunk
- Hands-on experience managing Kafka clusters (required)
- Experience with Jenkins or comparable CI/CD frameworks
- Familiarity with web services, REST and RPC APIs over HTTPS, and service discovery mechanisms
- Experience with AWS services, including EKS, EC2, Load Balancers, VPCs, S3, RDS, DynamoDB, Network Firewalls, and ECS
- Experience building observability using CloudWatch Logs and Alarms
- Experience with Infrastructure as Code, preferably AWS CDK (Python or Node.js)
Preferred Skills
- Programming experience in Java, Groovy, Node.js, or React
- Experience managing identity and access management lifecycles, particularly with Keycloak and AWS IAM
- Background supporting systems in financial services or other regulated industries
Education & Experience
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience)
- 3+ years of experience using cloud technologies to support large-scale business applications