Senior Software Engineer - Distributed Inference
NVIDIA - Full Time
- Senior (5 to 8 years)
Candidates must hold a Master's or PhD in Computer Science, Electrical Engineering, or a related field, along with at least 5 years of experience optimizing deep learning models in a production environment. Strong programming skills in Python and C++ are required, as is experience training large models with PyTorch and/or TensorFlow. A proven track record of optimizing large-scale models (over 10 billion parameters) and a deep understanding of GPU architecture and CUDA programming are essential. Candidates should also have experience across the entire development pipeline, from data processing to model inference, and in optimizing inference workloads for throughput and latency.
The Senior AI Performance Engineer will analyze and optimize the performance of massively parallel and distributed systems, implementing and fine-tuning distributed training strategies for multi-GPU and multi-node environments. They will write high-performance CUDA, Triton, C++, and PyTorch code, profile model performance to identify bottlenecks using tools such as NVIDIA Nsight Systems and PyTorch Profiler, and develop and maintain benchmarking suites for continuous performance monitoring.
AI tools for multimedia content creation
Genmo.ai specializes in providing AI tools for generating and editing multimedia content, including images, videos, and presentations. Users can upload images and animate specific parts, like transforming a static sky into a timelapse, or create entire movies by refining ideas, generating scenes, and selecting transitions. The platform caters to both individual content creators and businesses, operating on a subscription model with various service tiers. Genmo.ai differentiates itself by continuously enhancing its technology and focusing on user intent, ensuring that clients have powerful tools to realize their creative projects.