Who Are We?
M2P Fintech is a leading Banking Technology Platform, shaping the future of digital finance. Established in 2014 and headquartered in Chennai, we have a strong footprint across 30+ countries in Asia-Pacific, the Middle East & Africa (MEA), and Oceania.
At the forefront of next-gen fintech, we drive innovation in banking, lending, and payments infrastructure. Powering global fintech transactions, our comprehensive technology stack enables seamless banking solutions, lending platforms, Buy Now Pay Later (BNPL) services, customized credit cards, prepaid cards, and more.
About The Role
M2P Fintech is looking for a highly skilled Lead DevOps Engineer (AI-Native Infrastructure & Platform Engineering) with deep expertise in multi-cloud infrastructure, automation, AI infrastructure operations, and modern DevOps/SRE practices.
This role goes beyond traditional DevOps and requires a seasoned specialist capable of building and operating AI-ready infrastructure platforms that support high-throughput APIs, LLM/AI workloads, GPU/CPU-based compute, data-intensive systems, and real-time inference pipelines.
You will be responsible for architecting, automating, securing, and optimizing highly scalable and cost-efficient cloud environments that enable high-velocity engineering and AI teams. This is an ideal position for someone who combines technical ownership, an automation-first mindset, and a passion for developer productivity and platform reliability.
What Will You Do In This Role
Cloud Infrastructure & Platform Engineering (Multi-Cloud)
- Architect, deploy, and manage highly scalable and secure infrastructure across multiple clouds (AWS, GCP, Azure, OCI).
- Design cloud platforms supporting payments infrastructure, data pipelines, real-time APIs, and high-concurrency backend systems.
- Apply hands-on expertise with key multi-cloud services, including VMs, ECS/EKS/GKE/AKS/OKE, Lambda, RDS, DynamoDB, S3, VPC, CloudFront, IAM, CloudWatch, and GPU-enabled instances.
- Build and maintain Infrastructure-as-Code (IaC) using Terraform or CloudFormation.
- Design multi-AZ and multi-region architectures for high availability and disaster recovery (HA/DR).
- Build reusable platform templates and shared infrastructure modules.
CI/CD, Automation & Developer Productivity
- Build and maintain CI/CD pipelines using GitHub Actions, GitLab CI, Jenkins, or AWS CodePipeline.
- Automate deployments, environment provisioning, and release workflows.
- Build self-service developer platforms, preview environments, and reusable deployment workflows to improve developer productivity.
- Implement automated patching, scaling, backups, cleanup workflows, and drift detection.
Containers, Kubernetes & Platform Reliability
- Manage Docker-based environments and containerized applications, and optimize workloads on managed Kubernetes engines across clouds.
- Manage autoscaling, cluster health, node pools, ingress, service mesh, and workload isolation.
- Optimize infrastructure for performance, resilience, and cost-efficiency.
- Implement progressive deployment strategies including blue/green, canary, and rolling deployments.
Observability, Incident Response & SRE Practices
- Implement observability stacks using CloudWatch, Prometheus, Grafana, ELK, Datadog, OpenTelemetry, or New Relic.
- Build actionable dashboards and intelligent alerting systems while defining and tracking SLIs, SLOs, and SLAs.
- Lead incident response, root cause analysis, and blameless postmortems to reduce operational toil and improve MTTR.
FinOps, Cost Governance & Security
- Continuously monitor and optimize cloud costs (compute utilization, storage lifecycle, and data transfer) using tools such as Cost Explorer, Budgets, Trusted Advisor, CloudHealth, or Kubecost.
- Implement security best practices for IAM, VPCs, security groups, NACLs, and encryption, and manage secrets using KMS, SSM Parameter Store, or Vault.
- Build secure CI/CD pipelines with automated security checks, least-privilege access, and audit logging, and ensure compliance readiness for ISO 27001, SOC 2, and GDPR.
Collaboration, Leadership & Platform Culture
- Work closely with Payments engineering, product, and operations teams to drive a DevOps, SRE, GitOps, and automation-first culture.
- Mentor junior DevOps and Platform Engineers while creating and maintaining detailed runbooks, architecture diagrams, and platform documentation.
What You’ll Need to be Successful in this Role
- B.E. or B.Tech. in Computer Science, or an equivalent degree.
- 8+ years of experience in DevOps, SRE, Platform Engineering, or Cloud Infrastructure Engineering.
- Strong expertise in multi-cloud architecture and services, and a deep understanding of Kubernetes, containers, and cloud-native systems.
- Strong Infrastructure-as-Code expertise using Terraform or CloudFormation.
- Strong Linux administration, networking, DNS, routing, and load balancing knowledge.
- Strong scripting/programming experience in Python, Bash, or Go (preferred).
- Experience with CI/CD automation, GitOps workflows, and observability platforms supporting scalable production systems.
- Experience with payments infrastructure (prepaid, debit, credit, UPI, FRM, etc.) and inference optimization.
- Familiarity with Kafka, Redis, multiple databases, and event-driven systems.
- Exposure to platform engineering, internal developer platforms, and tools like ArgoCD, Helm, and OpenTelemetry.
- Any of the following certifications: Solutions Architect, DevOps Engineer, or SysOps Administrator.
- Knowledge of distributed systems and large-scale platform operations.
Perks And Benefits
- Inclusive and People-First Culture
- Health & Wellness Programs
- Comprehensive Medical Insurance
- Recognition Programs
- Performance-based ESOPs
- Learning Opportunities