Staff Research Engineer, Model Efficiency
Cohere
Full Time
Senior (5 to 8 years), Expert & Leadership (9+ years)
San Francisco, California, United States
Common questions about this position
The salary range is $190,000 - $250,000, with final offer amounts determined by factors including experience and expertise.
The position is remote.
Required skills include experience with ML systems and deep learning frameworks such as PyTorch, TensorFlow, and ONNX; familiarity with LLM architectures and inference optimizations such as continuous batching and quantization; and an understanding of GPU architectures or CUDA programming. The current stack involves Python, C++, TensorRT-LLM, and Kubernetes.
Benefits include comprehensive health, dental, and vision insurance for you and your dependents, plus a 401(k) plan. Equity is part of the total compensation package in addition to base salary.
Strong candidates will have hands-on experience with ML systems, deep learning frameworks, LLM inference optimizations, and GPU programming, along with familiarity with the company's stack including Python, C++, TensorRT-LLM, and Kubernetes.
Advanced answer engine providing reliable information
Perplexity AI provides an advanced answer engine that delivers accurate and reliable responses to user queries. The platform uses current sources to ensure the information is both precise and relevant. It caters to a wide audience, including individuals looking for quick answers and businesses needing detailed information. Unlike many competitors, Perplexity AI emphasizes high-quality, source-backed answers, making it a valuable resource for users seeking trustworthy data. The company's goal is to meet the increasing demand for immediate access to reliable information, generating revenue through subscription fees, advertising, and partnerships.