Flatiron

Site Reliability Engineering Manager, Provisioning Platforms

Job Description

Posted on: February 7, 2023

We’re looking for a Site Reliability Engineering Manager to help us accomplish our mission to improve and extend lives by learning from the experience of every person with cancer.

Responsibilities

  • Manage a team of 3 SREs and oversee the work of 1 embedded Security Engineer
  • Own the end-to-end reliability and user experience of provisioning cloud resources at Flatiron
  • Partner with peer and business unit teams on higher-level abstractions and platforms built on top of Terraform and EKS
  • Define team norms around areas like on-call, work planning, and root cause analysis
  • Help champion key SRE tenets like Reliability and Observability across Flatiron

Job Requirements

  • You have 1-2 years of experience managing small teams
  • You are experienced with building and operating observable and reliable systems
  • You are experienced with defining and measuring system health, and with working alongside stream-aligned teams to ensure SLOs are met
  • You are experienced in working with Kubernetes
  • You are experienced negotiating with nontechnical stakeholders and driving business value with technical solutions
  • You are a strong, transparent communicator
  • You seek simple approaches to complex problems