Senior Software Engineer - Distributed Inference
NVIDIA - Full Time
- Senior (5 to 8 years)
Candidates should have significant problem-solving experience in PyTorch, CUDA, and distributed systems. They must have experience training large models using Python and PyTorch, along with practical experience across the entire development pipeline, from data processing to inference. Additionally, candidates should have experience optimizing and deploying inference workloads for throughput and latency, profiling CPU and GPU code in PyTorch, and writing high-performance parallel C++ code. Familiarity with high-performance Triton/CUDA and writing custom PyTorch kernels is also required. Experience with deep learning concepts such as Transformers and multimodal generative models like Diffusion Models and GANs is a plus, as is experience building inference/demo prototype code with tools such as Gradio and Docker.
The Senior Research Engineer will ensure efficient implementation of models and systems for data processing, training, inference, and deployment. They will identify and implement optimization techniques for massively parallel and distributed systems, and remedy efficiency bottlenecks by profiling and by implementing high-performance CUDA, Triton, C++, and PyTorch code. The engineer will work closely with the research team to ensure systems are planned for maximum efficiency from start to finish, and will build tools to visualize, evaluate, and filter datasets. They will also implement cutting-edge product prototypes based on multimodal generative AI.
Develops multimodal AI technologies for creativity
Luma AI develops multimodal artificial intelligence technologies that enhance human creativity and capabilities. Its main product, the Dream Machine, allows users to interact with various types of data, enabling creative professionals, businesses, and developers to explore innovative applications of AI. Unlike many competitors, Luma AI focuses on integrating multiple modes of interaction, which broadens the possibilities for users. The company operates on a subscription model, providing access to its AI tools and services, and aims to lead the way in AI-driven creativity and productivity.