Senior Rust Engineer
Phantom
Full Time
Senior (5 to 8 years)
Candidates should possess strong proficiency in Rust and/or C++, familiarity with PyTorch and/or JAX, and experience designing or optimizing collectives such as those in NCCL, MPI, and XLA. Solid systems knowledge is required, including Linux internals, accelerator architectures (e.g., GPUs, TPUs), high-speed interconnects (e.g., NVLink, InfiniBand), and RDMA, along with a strong understanding of distributed systems concepts, algorithms, and challenges. Experience analyzing performance traces and logs from distributed systems and ML workloads is also necessary.
The Inference Software Engineer - Collectives will formalize and optimize collectives (e.g., Send/Receive, AllReduce, Broadcast) and collaborate across systems and research teams to bring MoE architectures to Sohu’s runtime, optimizing expert routing and communication layers using Sohu’s collectives. The role also contributes to scaling and enhancing Sohu’s runtime, including multi-node inference, intra-node execution, state management, and robust error handling. In addition, the engineer will develop tools for performance profiling and debugging to identify bottlenecks and correctness issues.
Develops servers for transformer inference
The company specializes in developing servers for transformer inference. Its chips integrate the transformer architecture directly, with the aim of making inference highly efficient. The main technologies used in the product are the transformer architecture and advanced chip integration.