Senior Enterprise Architect - AI Product Platform

OvalEdge • Full-time • Hyderabad, IN • 1d ago

About OvalEdge

OvalEdge is an enterprise-grade AI-powered Data Intelligence platform that unifies Data Catalog, Data Governance, Data Quality, and Analytics into a single product offered in both SaaS & on-prem versions. Trusted by Fortune 500 companies globally, OvalEdge enables organizations to discover, govern, and operationalize their data assets at scale — accelerating data-driven decisions while ensuring compliance and trust.

The Role

We are looking for a Senior Enterprise Architect to serve as the primary technical authority for the OvalEdge AI Product Platform. This is a hands-on leadership role: you will own the end-to-end architecture strategy for both our SaaS and on-prem platforms, spanning Generative AI, Agentic AI, RAG, Data Governance, cloud-native infrastructure, and Enterprise Integrations, while working closely with domain leads on execution.

You have done this before. You have shipped AI capabilities to production — not just designed them on whiteboards. You are equally comfortable presenting architecture strategy to the C-suite and doing a deep-dive code review with an engineering team. You thrive in a fast-moving product company where architecture decisions have direct customer impact.

What You'll Do

Architecture Vision & Governance

• Own the technology architecture strategy and maintain a rolling 24-month roadmap aligned to product and business objectives.

• Establish and enforce architecture standards, design principles, and decision records across all engineering initiatives.

• Lead the Architecture Council and chair the Architecture Review Board; conduct design reviews for all major platform initiatives.

• Identify and proactively retire technical debt; ensure extensibility for emerging AI technologies.

• Manage, scale, and evolve the architecture of a single product across multiple deployment platforms (SaaS and on-prem).

AI & Agentic Platform

• Design and evolve production-grade Agentic AI systems — multi-agent orchestration, hierarchical supervisor-agent frameworks, goal-oriented task decomposition, and long-running autonomous workflows.

• Define standards for Generative AI integration: multi-LLM routing, model abstraction layers, prompt and context engineering, token management, and cost optimization across OpenAI, Anthropic, Gemini, and open-source models.

• Architect RAG pipelines end-to-end: vector databases, embedding strategies, retrieval optimization, hallucination mitigation, and evaluation frameworks.

• Ensure Responsible AI practices — AI security, data privacy, bias controls, and governance compliance — are baked into every AI system design.

• Drive LLMOps maturity: model versioning, observability, drift detection, and automated evaluation in production.

• Define and enforce AI evaluation frameworks in production, using tools such as DeepEval, RAGAS, TruLens, Promptfoo, or similar, as quality gates within CI/CD pipelines rather than as one-off experiments.

• Establish continuous LLM evaluation: automated regression testing for response quality, faithfulness, context precision, and answer relevance across model upgrades and prompt changes.

SaaS & Cloud Platform

• Own the architecture for OvalEdge's SaaS platform on AWS (ECS/EKS, Lambda, S3, RDS, OpenSearch, Bedrock, IAM, and CloudWatch) and for our licensed on-prem platform.

• Design for elastic scaling, disaster recovery, cost efficiency, and 99.9%+ availability SLAs.

• Establish self-healing, self-monitoring, and auto-remediation capabilities; drive observability and reliability engineering maturity.

• Define containerization, CI/CD, and Infrastructure-as-Code standards across engineering teams.

Data & Analytics Architecture

• Design scalable data catalog, governance, and analytics architectures — semantic data layers, query optimization, in-memory analytics, and AI-assisted analysis pipelines.

• Architect MCP-based enterprise integration services: tool discovery, agent interoperability, REST and event-driven APIs, and partner SDK frameworks.

Engineering Excellence & Team Leadership

• Build and run a structured mentorship program across AI, Platform, and Architecture teams (15–50+ engineers) — paired mentoring, design-review shadowing, and direct 1:1 coaching — with the explicit goal of producing the next generation of architects from within.

• Drive a culture where architecture decisions are taught, not just enforced — raising the team's collective architectural IQ rather than creating a single point of expertise.

• Own bench strength: track high-potential engineers proactively and maintain a live succession map for all key architecture and platform leadership roles.

• Drive AI-assisted development, code generation tooling, and developer productivity improvements across the SDLC.

• Own the AI testing strategy end-to-end: unit-level prompt tests (Promptfoo), component-level RAG evaluation (DeepEval), and system-level agent behavior testing (LangSmith, Braintrust).

• Define LLM quality gates — hallucination rate, groundedness, toxicity, latency — that block releases automatically when thresholds are breached.

• Evaluate and standardize tooling across the AI eval stack: DeepEval for metric-based LLM unit tests, RAGAS for RAG pipeline scoring, TruLens for feedback functions, LangSmith for trace-level debugging, and Braintrust for dataset-driven regression testing.
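
As an illustration of the release-blocking quality gates described above, a minimal sketch might look like the following. The metric names mirror the bullet on LLM quality gates; the threshold values, the `Gate` class, and the function names are hypothetical and are not taken from OvalEdge's pipeline or from any of the tools listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    """One quality gate: a metric, its threshold, and which direction is good."""
    metric: str
    threshold: float
    higher_is_better: bool


# Hypothetical thresholds, for illustration only.
GATES = [
    Gate("hallucination_rate", 0.05, higher_is_better=False),
    Gate("groundedness", 0.90, higher_is_better=True),
    Gate("toxicity", 0.01, higher_is_better=False),
    Gate("p95_latency_s", 3.0, higher_is_better=False),
]


def breached_gates(scores, gates=GATES):
    """Return a description of every gate the evaluation scores fail.

    A metric missing from `scores` counts as a breach, so a release
    cannot pass simply because an evaluation step was skipped.
    """
    failures = []
    for gate in gates:
        value = scores.get(gate.metric)
        if value is None:
            failures.append(f"{gate.metric}: missing")
            continue
        if gate.higher_is_better:
            passed = value >= gate.threshold
        else:
            passed = value <= gate.threshold
        if not passed:
            failures.append(f"{gate.metric}: {value} vs threshold {gate.threshold}")
    return failures


def release_allowed(scores, gates=GATES):
    """True only when no gate is breached."""
    return not breached_gates(scores, gates)
```

In a CI/CD job, a step could call `release_allowed` on the scores produced by the evaluation run and exit with a non-zero status when it returns `False`, which is what would block the release.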

Cross-Functional Collaboration

• Partner with QA leadership on test strategy — including AI-assisted testing, performance, security, and reliability testing.

• Work closely with Product Management to assess feasibility, define solution approaches, and shape the product roadmap.

• Engage executive stakeholders to communicate technology strategy, manage risk, and translate architectural decisions into business value.

• Security: Partner with the Security team on threat modeling, secure-by-design reviews, and AI-specific risk assessments — treating security sign-off as a release gate. Own compliance architecture for SOC 2, ISO 27001, and emerging AI governance standards.

• DevOps / Platform Engineering: Co-own CI/CD, SLOs, and incident response architecture with the DevOps Lead — platform and infrastructure decisions are made jointly, not handed over.

• Customer Success: Maintain a direct feedback loop with Customer Success — translate enterprise-scale performance, integration, and reliability signals from the field into architectural improvements before they become escalations.

What You'll Bring

Required

• 15+ years in software engineering and architecture; 8+ years in enterprise or product architecture leadership roles.

• Hands-on, demonstrated experience shipping AI/ML capabilities to production.

• 5+ years building and scaling multi-tenant SaaS products on cloud-native platforms (AWS preferred).

• 5+ years building and scaling licensed on-prem products on multiple platforms.

• Experience managing architecture for products deployed across SaaS and multiple on-prem platforms.

• Deep expertise in Agentic AI frameworks and protocols (LangChain, LangGraph, CrewAI, AutoGen, MCP) and RAG architectures in production.

• Strong command of the Generative AI ecosystem: LLM providers (OpenAI, Anthropic, Gemini), model abstraction, prompt engineering, and LLMOps.

• Expert-level proficiency in Python and Java; solid understanding of microservices, REST APIs, and event-driven systems.

• Proven track record leading cross-functional engineering teams through large-scale platform transformations.

• Experience with enterprise architecture frameworks (TOGAF or equivalent).

Nice to Have

• 3+ years designing and shipping Generative AI or Agentic AI products commercially.

• Experience with Data Catalog, Data Governance, Data Quality, or Analytics platforms.

• Exposure to vector databases (pgvector, Pinecone, Weaviate, OpenSearch) and semantic search in production.

• Experience with AWS Bedrock, SageMaker, or similar managed AI/ML services.

• Background supporting enterprise-scale, Fortune 500 customers.

• Master's degree in Computer Science, AI/ML, or a related field.

Education

• Bachelor's degree in Computer Science, Engineering, or a related field (required).

• Master's degree or AI/ML specialization (preferred).

What Success Looks Like

In your first 90 days, you will have assessed the current architecture, identified the top 3 risks, and delivered an initial 12-month roadmap. Within 12 months, the platform will demonstrate measurable improvements across these dimensions:

Dimension - Metric - Target

Platform Reliability - Uptime - > 99.9%

AI Quality - Agent success rate & RAG accuracy - Improving QoQ

Engineering Delivery - Roadmap predictability - ≥ 90%

Scalability - Customer growth supported without redesign - 3× headroom

Cost Efficiency - AI & infra cost per transaction - Optimized YoY

Why OvalEdge

• Build at the intersection of AI and Data — one of the highest-impact domains in enterprise software.

• Greenfield AI architecture ownership with real production scale and Fortune 500 customer exposure.

• Collaborative, engineering-first culture with a direct line to product strategy and executive leadership.

• Competitive compensation, equity participation, and a dynamic work environment.
