SENIOR, STAFF, AND LEAD SITE RELIABILITY ENGINEERS @ MONGODB
MongoDB is seeking highly skilled Site Reliability Engineers at Senior, Staff, and Lead levels to join our infrastructure team. These roles focus on building and managing large-scale infrastructure, with particular emphasis on database systems, so that our platform remains performant and reliable for millions of users worldwide. You will work on challenging technical problems involving massive data sets and distributed systems architecture.
The position is offered as a hybrid or remote role, specifically targeting candidates in Ireland to align with regulatory and timezone requirements. Candidates will be responsible for implementing automation, managing cloud-native infrastructure, and collaborating with cross-functional teams to improve our overall reliability posture. This is an excellent opportunity for engineers who enjoy the intersection of infrastructure and database technologies at scale.
Key Requirements
Extensive professional experience in Site Reliability Engineering or DevOps roles.
Deep understanding of database systems management and optimization.
Proficiency in programming or scripting languages such as Go, Python, or Bash.
Hands-on experience managing large-scale infrastructure on cloud platforms like AWS, GCP, or Azure.
Expertise in containerization and container orchestration, specifically Docker and Kubernetes.
Strong knowledge of infrastructure-as-code tools such as Terraform or Ansible.
Experience with monitoring and observability frameworks like Prometheus and Grafana.
Proven ability to manage and scale distributed systems in a production environment.
Excellent problem-solving skills and experience participating in on-call rotations.
Strong communication skills and the ability to work effectively in a remote/hybrid team setting.