Requirements:
- Have a positive approach and work to enable and support those around you
- Work effectively in a team-based agile environment to monitor, log, resolve, and
escalate infrastructure issues
- Continually look for opportunities to develop solutions through automation;
participate in teams dedicated to continuous improvement
- You have worked as a DevOps or Site Reliability Engineer (or similar position) for
at least 6-8 years
- Excellent critical, system-level thinking
- 5+ years of experience with Amazon Web Services
- Knowledge of Cloud and System State automation tools (Chef, Puppet, Ansible,
CloudFormation, Terraform)
- In-depth experience with Linux and strong networking comprehension
- Experience with scripting languages (bash, python, ruby)
- Experience with productivity tools and workflow models such as Jira, Scrum/Kanban, Confluence, Request Tracker, Asana, etc.
- Great troubleshooting skills with the ability to diagnose issues quickly on the fly
- Current with industry trends and best practices
- Time and project management skills; able to prioritize and task switch as needed
- Team player and collaborator
- Strong communication skills; ability to communicate
in clear, concise, unambiguous terms when documenting and troubleshooting
- Experience in software development is a plus
Bonus Skills:
- Familiarity with container orchestration services, especially Kubernetes
- Prior experience with Docker
- Experience administering and deploying development CI/CD tools such as Git,
Jira, GitLab, Jenkins, GoCD, etc.
- Familiarity with ISO 27001, security management protocols, intrusion detection,
SOC 1/2, or SSAE 16
Responsibilities:
- Maintain and build Bynder's global scaling infrastructure
- Troubleshoot and debug network, system, and application issues using tools such
as New Relic, Sumo Logic, packet capture data, and the Linux shell
- Help the Development team in their workflow and streamline releases
- Advocate operational best practices to Product, Professional Services,
Development, and Support teams
- Drive process and company-wide communication, including post-mortems,
incident management, and project documentation
- Drive automation, monitoring, and horizontal scalability of key systems
- Address high availability concerns, weak points, performance bottlenecks,
manually configured state, and information security issues
- Provide feedback and guidance on architecture proposals from across the
organization
- Ensure that proactive and efficient monitoring is in place for all our vital
microservices