DevOps Engineer

Sketch Brahma Technologies • Full-time • Bengaluru, IN

Job Overview:

We are looking for a skilled Site Reliability Engineer (SRE) to join our team. The SRE will play a critical role in maintaining the reliability, performance, and scalability of our services. This role involves working with cloud platforms such as AWS, Azure, and Oracle Cloud, managing Ubuntu-based systems, and ensuring the seamless operation of our infrastructure. The ideal candidate will have a strong background in system administration, cloud technologies, and modern DevOps practices.

Key Responsibilities:

● Infrastructure Management:

○ Design, implement, and manage scalable, resilient, and secure infrastructure on cloud providers such as AWS, Azure, and Oracle Cloud.

○ Oversee the administration of Ubuntu servers, ensuring optimal performance and uptime.

● Automation and Monitoring:

○ Implement monitoring and alerting systems to proactively identify and resolve issues before they impact users.

○ Automate repetitive tasks to improve system reliability and operational efficiency.

● Containerization and Orchestration:

○ Deploy and manage containerized applications using Docker.

○ Utilize Kubernetes for container orchestration, ensuring efficient and reliable application deployment and scaling.

● Performance Optimization:

○ Analyze system performance metrics and optimize infrastructure to meet performance targets.

○ Troubleshoot and resolve issues related to server performance, network latency, and other system bottlenecks.

● Collaboration and Support:

○ Work closely with development teams to ensure new applications and features are designed with reliability and scalability in mind.

○ Provide guidance and mentorship to junior engineers on best practices for system reliability and cloud management.

○ Participate in on-call rotations to provide 24/7 support for critical issues.

● Security and Compliance:

○ Implement security best practices across all infrastructure components, including firewalls, VPNs, and access controls.

○ Ensure compliance with industry standards and internal policies for data protection and privacy.

Technical Skills:

● Proven experience with cloud providers such as AWS, Azure, and Oracle Cloud.

● Strong proficiency in managing and troubleshooting Ubuntu servers.

● Hands-on experience with Nginx, Kubernetes, and Docker.

● Familiarity with scripting languages (e.g., Bash, Python) for automation tasks.

● Experience with CI/CD pipelines and tools like Jenkins, GitLab CI, or equivalent.

● Knowledge of networking fundamentals and security best practices.

Professional Experience:

● 2+ years of experience in a Site Reliability Engineer or similar role.

● Excellent problem-solving skills and attention to detail.

● Strong communication skills, with the ability to collaborate effectively with cross-functional teams.

● Self-motivated, with the ability to work both independently and as part of a team.
