Senior Software Engineer - Distributed Inference
NVIDIA
Full Time
Senior (5 to 8 years)
Common questions about this position
The position is onsite.
This information is not specified in the job description.
The role requires expertise in inference performance architecture, fused kernel development for transformers, model mapping strategies such as tensor and expert parallelism, hardware-software co-design for inference algorithms, and building scalable, high-performance teams.
Etched.ai features a high-performing team of leading engineers focused on pioneering AI inference innovations, backed by top-tier investors, with an emphasis on delivering exceptional performance and rapid development cycles.
Strong candidates have experience leading teams to optimize inference kernels for transformer models, implementing advanced techniques like fused kernels and parallelism, and delivering production-ready implementations rapidly, as indicated by the 'You may be a good fit if you have' section.
Develops servers for transformer inference
The company develops servers for transformer inference, built on chips that integrate the transformer architecture directly into the hardware for high efficiency.