Chief Architect, Tools and Integration
Job Location: Mumbai/Pune/Bangalore/Gurgaon/Chennai
Position Overview:
The Chief Architect, Tools and Integration, is responsible for designing and optimizing IT tools and systems that enhance infrastructure and application performance monitoring, observability, and automation. This role focuses on delivering state-of-the-art integration frameworks for multi-cloud environments, IoT, and ITSM tools while ensuring real-time visibility into infrastructure and application health. Key areas include application performance monitoring (APM), API integration, and driving innovation to improve operational efficiency and scalability.
Overall Accountability and Responsibility:
Infrastructure and Application Observability:
- Architect end-to-end observability solutions for IT infrastructure and critical applications using tools like Splunk, Dynatrace, AppDynamics, and Moogsoft.
- Enable monitoring of key application performance metrics, such as response times, throughput, latency, and error rates, across hybrid cloud and on-prem environments.
- Develop and maintain real-time dashboards that offer actionable insights into infrastructure and application health.
- Establish frameworks for anomaly detection and root cause analysis leveraging machine learning and AIOps platforms.
Performance Monitoring Strategy:
- Design and implement strategies for monitoring application and system performance with a focus on predictive analytics and proactive remediation.
- Collaborate with DevOps and development teams to define performance benchmarks and ensure alignment with business SLAs.
- Lead initiatives to optimize the performance of APIs, databases, microservices, and containerized applications.
API Integration and Workflows:
- Develop API frameworks to integrate multi-cloud platforms (AWS, Azure, GCP) with enterprise IoT systems and ITSM tools.
- Build scalable APIs for seamless communication between monitoring tools, applications, and automation systems.
- Enable API-driven data sharing for unified observability and incident management.
Automation and Orchestration:
- Lead automation projects leveraging Ansible, Python, and Terraform to improve application deployment, monitoring, and incident remediation workflows.
- Integrate APM tools with ITSM platforms like ServiceNow, BMC Remedy, and Cherwell to streamline incident lifecycle management and change request handling.
Tool Strategy and Multi-ITSM Integration:
- Develop strategic roadmaps for integrating application and infrastructure monitoring tools into unified observability platforms.
- Drive alignment between ITSM tools (e.g., ServiceNow, BMC Remedy) and APM solutions for comprehensive incident resolution and reporting.
- Evaluate and implement cutting-edge solutions to enhance both IT and application observability.
Team Leadership and Collaboration:
- Mentor and upskill teams on APM best practices, multi-tool integrations, and observability technologies.
- Collaborate with business and technical stakeholders to ensure alignment on application and infrastructure performance goals.