AI Engineer & Researcher, Inference
Speechify
Full Time
Junior (1 to 2 years)
Common questions about this position
The salary range is $180,000 - $440,000 USD.
The position is based in the Bay Area (San Francisco and Palo Alto).
Required experience includes system optimizations for model serving (e.g., batching, caching), low-level optimizations (e.g., GPU kernels), algorithmic optimizations (e.g., quantization), large-scale production serving, and testing/benchmarking of inference services. The tech stack involves Python/Rust, PyTorch/JAX, CUDA/CUTLASS/Triton/NCCL, Kubernetes, and SGLang.
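To make the first requirement concrete, below is a minimal, hypothetical sketch of dynamic request batching for model serving in PyTorch: incoming requests are queued and executed as a single batched forward pass to amortize per-request overhead. The names (ToyModel, BatchingServer, max_batch, max_wait_ms) are illustrative assumptions, not part of the posting; production systems such as SGLang implement far more sophisticated continuous batching, KV caching, and GPU-level scheduling.

# Illustrative sketch only: hypothetical names, deliberately simplified.
import queue
import threading
import time

import torch


class ToyModel(torch.nn.Module):
    """Stand-in model: a single linear layer."""

    def __init__(self, dim: int = 16):
        super().__init__()
        self.linear = torch.nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


class BatchingServer:
    """Collects individual requests and runs them through the model
    as one batched forward pass, amortizing per-call overhead."""

    def __init__(self, model: torch.nn.Module, max_batch: int = 8, max_wait_ms: float = 5.0):
        self.model = model.eval()
        self.max_batch = max_batch
        self.max_wait = max_wait_ms / 1000.0
        self.requests: "queue.Queue[tuple[torch.Tensor, queue.Queue]]" = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, x: torch.Tensor) -> torch.Tensor:
        """Blocking call: enqueue one input and wait for its result."""
        result_q: queue.Queue = queue.Queue(maxsize=1)
        self.requests.put((x, result_q))
        return result_q.get()

    def _worker(self) -> None:
        while True:
            # Wait for the first request, then gather more until the
            # batch is full or the wait budget is spent.
            first = self.requests.get()
            batch = [first]
            deadline = time.monotonic() + self.max_wait
            while len(batch) < self.max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self.requests.get(timeout=remaining))
                except queue.Empty:
                    break
            inputs = torch.stack([x for x, _ in batch])
            with torch.no_grad():
                outputs = self.model(inputs)
            # Scatter results back to the waiting callers.
            for (_, result_q), out in zip(batch, outputs):
                result_q.put(out)


if __name__ == "__main__":
    server = BatchingServer(ToyModel())
    out = server.submit(torch.randn(16))
    print(out.shape)  # torch.Size([16])

The trade-off this sketch illustrates is latency versus throughput: a larger max_batch or longer max_wait_ms raises hardware utilization at the cost of per-request latency, which is exactly the kind of system-level tuning the role describes.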
The team is small, highly motivated, and focused on engineering excellence, with a flat organizational structure; employees are expected to be hands-on, show initiative, and bring strong communication, work ethic, and prioritization skills.
In your CV and statements of exceptional work, highlight inference optimizations, production serving experience, and familiarity with the relevant tech stack, as the team reviews these materials first.
AI tools for research and information retrieval
x.ai develops AI tools aimed at enhancing research and information retrieval. Its main product, Grok, is designed to answer a wide variety of questions, including unconventional ones that other AI systems might not handle. Grok provides real-time knowledge, making it a useful resource for researchers, academics, and professionals who need quick access to relevant information. It stands out among competitors for its ability to suggest questions and provide nuanced answers across a diverse range of inquiries. x.ai's goal is to empower users by streamlining their research processes and fostering innovation through reliable access to information.