Job Description: Senior DevOps Engineer (On-Prem & Hybrid Infrastructure)
Experience Required: 3–6 Years
Location - Gurugram (5 days a week, work from office)
We are looking for a highly skilled Senior DevOps Engineer with strong hands-on experience in on-premise infrastructure, Linux systems, automation, CI/CD, container platforms, observability, and platform engineering. The ideal candidate has a deep understanding of enterprise-grade DevOps tools and practices, especially in hybrid and on-prem environments.
Key Responsibilities
1. Core Infrastructure & Operating Systems
• Manage and administer Linux environments: RHEL/CentOS 7–9, Ubuntu 20.04/22.04, Oracle Linux, SUSE.
• Handle enterprise storage systems: NAS, NetApp, HPE 3PAR/Nimble.
• Work with SAN/NAS protocols: FC, iSCSI, NFS, CIFS/SMB, including multipathing.
• Perform bare-metal provisioning for servers and appliances.
2. Container Platforms (On-Prem Kubernetes)
• Strong hands-on expertise in Red Hat OpenShift (4.12–4.16).
• Good understanding of core Kubernetes concepts.
• Manage container runtimes like containerd or CRI-O.
• Work on cluster upgrades, operators, networking, storage classes, monitoring, and security hardening.
3. CI/CD & Automation
• Implement CI/CD pipelines using Jenkins (Pipeline-as-Code, shared libraries, distributed agents).
• Work with GitLab CE/EE, including runners and self-hosted workflow automation.
• Manage GitOps deployments via ArgoCD, Flux v2, or similar.
• Strong proficiency in Ansible (AWX/Tower).
• Infrastructure-as-Code using Terraform/OpenTofu.
• Build standard VM/bare-metal images using Packer (good-to-have).
4. Configuration Management
• Maintain automation using Ansible as the primary tool.
• Manage OS fleets through Red Hat Satellite or SUSE Manager.
• Exposure to Puppet or Chef (legacy systems support).
5. Monitoring, Logging & Observability
• Manage the observability stack, with Prometheus + Thanos/Cortex/Mimir for metrics.
• Build dashboards and alerts in Grafana / Grafana Enterprise.
• Experience with Loki + Promtail/Alloy for logs.
• Optional exposure: Elastic Stack (ELK/EFK), OpenSearch, Zabbix, Nagios.
• Deploy and maintain OpenTelemetry collectors.
• Experience with VictoriaMetrics is a plus.
6. Security, Compliance & Secrets Management
• Manage enterprise secrets using HashiCorp Vault (on-prem).
• Perform vulnerability and compliance scanning using OpenSCAP, Anchore, Trivy, Clair.
• Work with SELinux/AppArmor in enforcing mode.
• Handle certificate lifecycle with Venafi, internal CA, or cert-manager.
• Support VA scans using tools like Nessus.
7. Networking (Enterprise-Grade On-Prem & Hybrid)
• Good understanding of cloud-style concepts: Security Groups, NACLs, NAT, IGW, Route Tables.
• Exposure to pfSense, FortiGate, Cisco firewalls.
• Experience with load balancers (F5 BIG-IP, NGINX, HAProxy) and the XAMPP web server stack.
• Understanding of BGP/EVPN, VLANs, VXLAN, firewalls, and enterprise network segmentation.
8. Backup, Disaster Recovery & High Availability
• Manage backups for DBs, VMs, and applications.
• Understanding of stretched clusters, metro clusters, and Site Recovery Manager (SRM).
• Practical knowledge of designing and supporting both HA and DR, and of the differences between them, for servers, databases, and applications.
9. Databases & Middleware (DevOps-Managed Environments)
• Hands-on experience managing:
o PostgreSQL, MySQL/MariaDB, MongoDB, Oracle, MS SQL Server
o Redis, RabbitMQ, Kafka clusters
• Experience with operators like:
o CrunchyData (Postgres)
o Percona (MySQL/Mongo)
o StackGres (Postgres)
o Strimzi (Kafka)
10. Scripting & Platform Engineering
• Strong scripting using Python, Go (for internal tooling), and advanced Bash.
• In-depth understanding of Linux internals: cgroups, namespaces, kernel tuning.
• Work with Git workflows: GitOps, trunk-based development, monorepos.
• Capacity planning, infrastructure right-sizing, and on-prem cost optimization.
• Familiarity with:
o Redis, Kafka
o Jupyter, Airflow
o Jenkins (deep expertise)
o Zenduty, SonarQube
o Docker RBAC, Teleport, SSO
• Strong platform engineering mindset to build internal automation platforms.
Required Qualifications
• 3–6 years of experience in DevOps, Site Reliability Engineering, or Platform Engineering.
• Strong hands-on experience with Linux and enterprise on-prem infrastructure.
• Experience managing HA production systems at scale.
• Strong problem-solving skills and ability to work independently in complex environments.
Good to Have
• Certifications in RHEL, OpenShift, Kubernetes, Terraform, Ansible, or security tools.
• Experience in regulated industries (BFSI, fintech, telco).
Why Join Us
• Opportunity to work on enterprise-grade infrastructure in an advanced hybrid environment.
• Exposure to full-stack DevOps, including OS, platform, containers, networking, and automation.
• High ownership role with modern tooling and platform engineering responsibilities.
Apply - sineha@hirojet.com