DevOps Lead – Chandigarh

Simplify • Full-time • Chandigarh, IN • 4d ago

Job description

Location: Onsite - Chandigarh

Employment Type: Full-Time

Experience: 5+ Years

Department: Engineering

About SimplifyAI

SimplifyAI is a fast-growing AI-first startup building intelligent solutions across Cloud, Data, and Generative AI. We help enterprises automate workflows, unlock data potential, and accelerate digital transformation using cutting-edge AI. With a lean, high-ownership engineering culture and offices in Chandigarh, India and Jakarta, Indonesia, we move fast, think big, and build things that matter.

About the Role

We are seeking a highly skilled and self-driven DevOps Lead with a minimum of 5 years of hands-on experience to strengthen our Engineering team. In this role, you will own the full lifecycle of our cloud infrastructure — from provisioning and automation to monitoring and incident response. You will be a critical bridge between development and operations, ensuring our systems are resilient, secure, and ready to scale.

This is an on-site role requiring strong collaboration with cross-functional teams including backend engineers, QA, and security.

Key Responsibilities

Infrastructure & Cloud

Architect, provision, and manage production-grade infrastructure on AWS / GCP / Azure

Design highly available and fault-tolerant systems using cloud-native services

Manage networking components: VPCs, subnets, route tables, security groups, NAT gateways, VPNs

Oversee DNS management, SSL/TLS certificates, load balancers, and CDN configurations

Drive cloud cost optimization initiatives and enforce resource governance policies

Automation & CI/CD

Build, maintain, and continuously improve CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, or CircleCI)

Automate infrastructure provisioning and configuration using Terraform, Ansible, or Pulumi

Implement GitOps workflows and environment promotion strategies (dev → staging → production)

Automate repetitive operational tasks through scripting (Bash, Python, or Go)

Containers & Orchestration

Manage containerized workloads using Docker and Docker Compose

Administer Kubernetes clusters — deployments, services, ingress controllers, HPA, resource quotas, and RBAC

Manage Helm charts for standardized application packaging and deployment

Monitoring & Observability

Set up and maintain end-to-end observability using Prometheus, Grafana, Datadog, or equivalent

Implement structured log aggregation using ELK Stack or Loki + Grafana

Configure distributed tracing with OpenTelemetry, Jaeger, or Zipkin

Define SLOs/SLAs, alerting thresholds, and on-call escalation runbooks

Security & Compliance

Champion DevSecOps practices across the SDLC

Manage secrets using HashiCorp Vault, AWS Secrets Manager, or Doppler

Enforce network policies, pod security standards, and least-privilege IAM roles

Conduct regular vulnerability scanning (Trivy, Snyk) and coordinate remediation

Ensure infrastructure complies with relevant security standards (working awareness of SOC 2 and ISO 27001)

Incident Management

Lead production incident response, perform thorough Root Cause Analysis (RCA), and drive post-mortems

Define and improve disaster recovery (DR) and business continuity plans

Establish and test backup and restore procedures for critical databases and services

Collaboration & Documentation

Work closely with developers to implement deployment strategies: blue/green, canary, and rolling updates

Maintain up-to-date runbooks, architecture diagrams, and infrastructure documentation

Mentor junior engineers on DevOps practices and cloud fundamentals

Required Skills & Qualifications

Cloud & Infrastructure

Proficiency in IaC tools — Terraform (mandatory), Ansible, or Pulumi

Strong understanding of VPC design, multi-region architectures, and cloud networking

Familiarity with serverless (AWS Lambda / Cloud Functions) and managed services (RDS, ElastiCache, S3)

Monitoring & Observability

Hands-on experience with Prometheus + Grafana dashboards and alerting

Experience with log management using ELK Stack or Loki

Ability to define meaningful SLIs, SLOs, and error budgets

Databases & Messaging

Production experience with PostgreSQL (replication, backup, query optimization)

Hands-on experience with Redis (clustering, persistence, eviction policies)

Familiarity with message brokers and task queues: RabbitMQ, Kafka, or Celery with Redis

Education

Bachelor's degree in Computer Science, Information Technology, or a related field

Relevant certifications preferred: AWS Solutions Architect, CKA (Certified Kubernetes Administrator), HashiCorp Terraform Associate

Soft Skills

Strong ownership mindset — treats production systems as a personal responsibility

Excellent analytical and root-cause-oriented problem-solving skills

Clear and concise communicator, both written and verbal

Comfortable with ambiguity and able to prioritize independently in a fast-paced environment

Team-first attitude with the ability to mentor and uplift peers

Disciplined about documentation and knowledge sharing

What We Offer

Competitive salary benchmarked to market standards

On-site work culture with a collaborative, high-performance engineering team

Access to the latest tooling, cloud credits, and hardware

Dedicated learning & development budget (certifications, courses, conferences)
