Senior Software Engineer - Distributed Inference
NVIDIA - Full Time
- Senior (5 to 8 years)
Candidates should have expertise in CUDA or OpenCL, demonstrated experience developing CUDA kernels or equivalent technologies, and proficiency in Python for AI and performance optimization tasks. Experience with deep learning frameworks such as PyTorch or TensorFlow is essential, along with a strong understanding of CPU and GPU architecture to analyze and optimize performance at the hardware level.
The Senior Staff or Principal AI Performance Engineer will optimize inference engines such as vLLM, enhance scalable AI infrastructure, and implement optimizations that accelerate AI inference. They will develop and deploy CUDA kernels for deep learning workloads, conduct performance analysis to resolve bottlenecks, engage with the AI research community, improve onboarding and documentation, and collaborate cross-functionally with AI researchers, engineers, and infrastructure teams.
Utilizes wasted energy for computing power
Crusoe Energy Systems Inc. provides digital infrastructure that uses wasted, stranded, or clean energy sources to power high-performance computing and artificial intelligence. The company offers clients in the technology and energy sectors scalable computing solutions that aim to reduce greenhouse gas emissions and support the transition to cleaner energy. Crusoe converts excess natural gas and renewable energy into computing power, which allows it to maximize resource efficiency while minimizing environmental impact. Unlike many competitors, Crusoe specifically targets the intersection of energy and technology. It generates revenue by supplying computing resources, along with technical support, to enterprises that need significant computational power for applications like AI and machine learning.