ML Performance Engineer
Serve Robotics
Full Time
Senior (5 to 8 years), Mid-level (3 to 4 years)
Candidates must have strong expertise in the Python ecosystem and major ML frameworks like PyTorch and JAX, along with experience in lower-level programming languages such as C++ or Rust. A deep understanding of GPU acceleration, including CUDA, profiling, and kernel-level optimization, is essential, with TPU experience being a strong plus. Proven ability to accelerate deep learning workloads using compiler frameworks, graph optimizations, and parallelization strategies is required, as is a solid understanding of the deep learning lifecycle from model design to inference deployment. Strong debugging, profiling, and optimization skills in large-scale distributed environments are necessary, alongside excellent communication and collaboration skills.
The Senior Research Engineer will investigate and mitigate performance bottlenecks in large-scale distributed training and inference systems. They will develop and implement both low-level and high-level optimization strategies and translate research models and prototypes into highly optimized, production-ready inference systems. Responsibilities include exploring and integrating inference compilers; designing, testing, and deploying scalable solutions for parallel and distributed workloads on heterogeneous hardware; and facilitating knowledge transfer between Research and Engineering teams.
Speech recognition and audio intelligence solutions
AssemblyAI specializes in Speech AI technology, focusing on automatic speech recognition (ASR) and audio intelligence. Its main product is an API that allows businesses to transcribe audio and video content, detect speakers, analyze sentiment, and redact personally identifiable information (PII). Clients integrate these capabilities into their own applications to get accurate and scalable speech-to-text. AssemblyAI emphasizes continuous improvement of its AI models, backed by a team of research leaders and engineers. Its goal is to help businesses unlock the potential of voice data, making it easier to derive insights and build innovative applications.