Site Reliability Engineering Manager, Provisioning Platforms
Location
New York, NY
Level
Senior
Department
Engineering
Type
Full-Time
Salary
Job Description
Posted on:
February 7, 2023
We’re looking for a Site Reliability Engineering Manager to help us accomplish our mission to improve and extend lives by learning from the experience of every person with cancer.
Responsibilities
- Manage a team of 3 SREs and oversee the work of 1 embedded Security Engineer
- Own the end-to-end reliability and user experience of provisioning cloud resources at Flatiron
- Partner with peer and business unit teams on higher-level abstractions and platforms built on top of Terraform and EKS
- Define team norms around areas like on-call, work planning, and root cause analysis
- Help champion key SRE tenets like Reliability and Observability across Flatiron
Job Requirements
- You have 1-2 years of experience managing small teams
- You are experienced with building and operating observable and reliable systems
- You are experienced with defining and measuring system health, and with working alongside stream-aligned teams to ensure SLOs are met
- You are experienced in working with Kubernetes
- You are experienced in negotiating with nontechnical stakeholders and driving business value with technical solutions
- You are a strong, transparent communicator
- You seek simple solutions to complex problems