Senior Software Engineer - Distributed Inference
Groq - Full Time
- Senior (5 to 8 years)
Candidates should possess deep expertise in computer architecture, operating systems, algorithms, hardware-software interfaces, and parallel/distributed computing, along with mastery of system-level programming (C++, Rust, or similar) with an emphasis on low-level optimization and hardware-aware design. They should excel at profiling and optimizing systems for latency, throughput, and efficiency, with zero tolerance for wasted cycles or resources, and should be committed to automated testing and CI/CD pipelines.
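To give a concrete flavor of the profiling and latency work described above, here is a minimal Rust sketch that measures per-call latency of a hot path and reports p50/p99 percentiles. The `profile` helper and the dummy summation workload are illustrative assumptions for this posting, not part of Groq's tooling.

```rust
use std::time::Instant;

/// Measure per-call latency of a hot path and report (p50, p99) in nanoseconds.
/// `kernel` stands in for whatever unit of work is being profiled.
fn profile<F: FnMut()>(mut kernel: F, iters: usize) -> (u128, u128) {
    let mut samples: Vec<u128> = Vec::with_capacity(iters);
    for _ in 0..iters {
        let start = Instant::now();
        kernel();
        samples.push(start.elapsed().as_nanos());
    }
    samples.sort_unstable();
    let p50 = samples[iters / 2];
    let p99 = samples[(iters * 99) / 100];
    (p50, p99)
}

fn main() {
    // Dummy workload: summing a buffer, standing in for a real inference step.
    let data: Vec<u64> = (0..1_000_000).collect();
    let (p50, p99) = profile(
        || {
            let s: u64 = data.iter().sum();
            std::hint::black_box(s); // discourage the optimizer from eliding the work
        },
        200,
    );
    println!("p50 = {} ns, p99 = {} ns", p50, p99);
}
```

Reporting percentiles rather than averages reflects the tail-sensitive, real-time character of the workloads this role targets.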
The Senior Software Engineer will build and operate real-time, distributed compute frameworks and runtimes that deliver planet-scale inference for LLMs and advanced AI applications at ultra-low latency, optimized for heterogeneous hardware and dynamic global workloads. They will develop deterministic, low-overhead hardware abstractions for thousands of synchronously coordinated GroqChips across a software-scheduled interconnection network, prioritizing fault tolerance, real-time diagnostics, ultra-low-latency execution, and mission-critical reliability. They will also future-proof Groq’s software stack for next-generation silicon, innovative multi-chip topologies, emerging form factors, and heterogeneous co-processors, and will collaborate across cloud, compiler, infrastructure, data center, and hardware teams to align engineering efforts and drive progress toward shared goals.
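As an illustration of what a deterministic, low-overhead hardware abstraction might look like, here is a hedged Rust sketch. The `Accelerator` trait, the `MockChip` device, and all method names are hypothetical stand-ins invented for this example; they are not Groq's actual runtime API.

```rust
use std::time::Duration;

/// Hypothetical device abstraction: a small, deadline-aware surface
/// so the runtime can stay deterministic and easy to diagnose.
trait Accelerator {
    /// Enqueue a pre-scheduled program; returns a ticket for completion tracking.
    fn submit(&mut self, program_id: u64) -> Result<u64, DeviceError>;
    /// Block until the given ticket completes or the deadline expires.
    fn wait(&mut self, ticket: u64, deadline: Duration) -> Result<(), DeviceError>;
    /// Lightweight health probe used by real-time diagnostics.
    fn healthy(&self) -> bool;
}

#[derive(Debug)]
enum DeviceError {
    Timeout,
    Offline,
}

/// Mock device used to exercise the abstraction in tests without hardware.
struct MockChip {
    next_ticket: u64,
    online: bool,
}

impl Accelerator for MockChip {
    fn submit(&mut self, _program_id: u64) -> Result<u64, DeviceError> {
        if !self.online {
            return Err(DeviceError::Offline);
        }
        self.next_ticket += 1;
        Ok(self.next_ticket)
    }

    fn wait(&mut self, _ticket: u64, deadline: Duration) -> Result<(), DeviceError> {
        // A real implementation would poll a completion queue;
        // the mock completes immediately unless given no time budget.
        if deadline == Duration::ZERO {
            return Err(DeviceError::Timeout);
        }
        Ok(())
    }

    fn healthy(&self) -> bool {
        self.online
    }
}

fn main() -> Result<(), DeviceError> {
    let mut chip = MockChip { next_ticket: 0, online: true };
    let ticket = chip.submit(42)?;
    chip.wait(ticket, Duration::from_micros(50))?;
    assert!(chip.healthy());
    println!("program completed within deadline");
    Ok(())
}
```

Keeping the abstraction to a few deadline-aware calls (submit, wait, health probe) is one way to preserve determinism, keep per-call overhead low, and make fault injection straightforward in tests.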
AI inference technology for scalable solutions
Groq specializes in AI inference technology, providing the Groq LPU™, which is known for its high compute speed, quality, and energy efficiency. The Groq LPU™ is designed to handle AI processing tasks quickly and effectively, making it suitable for both cloud and on-premises applications. Unlike many competitors, Groq's products are designed, fabricated, and assembled in North America, which helps maintain high standards of quality and performance. The company targets a variety of clients across different industries that require fast and efficient AI processing capabilities. Groq's goal is to deliver scalable AI inference solutions that meet the growing demands for rapid data processing in the AI and machine learning market.