OpenAI

Software Engineer, Model Inference

Job Description

Posted on: 
January 11, 2023

We are looking for an engineer who wants to take the world's largest and most capable AI models and optimize them for use in a high-volume, low-latency, and high-availability production environment.

Responsibilities

  • Work alongside machine learning researchers, engineers, and product managers to bring our latest technologies into production.
  • Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our deployed models.
  • Build tools to give us visibility into our bottlenecks and sources of instability, then design and implement solutions to address the highest-priority issues.
  • Optimize our code and fleet of Azure VMs to utilize every FLOP and every GB of GPU RAM of our hardware.

Job Requirements

  • Have an understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference.
  • Own problems end-to-end and are willing to pick up whatever knowledge you're missing to get the job done.
  • Have at least 3 years of professional software engineering experience.
  • Have or can quickly gain familiarity with PyTorch, NVIDIA GPUs and the software stacks that optimize them (e.g. NCCL, CUDA), as well as HPC technologies such as InfiniBand, MPI, etc.
  • Have experience architecting, observing, and debugging production distributed systems.
  • Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.
  • Have needed to rebuild or substantially refactor production systems several times over due to rapidly increasing scale.
  • Are self-directed and enjoy figuring out the most important problem to work on.
  • Have a good intuition for when off-the-shelf solutions will work, and build tools to accelerate your own workflow quickly if they won't.
