Senior Research Engineer - Enterprise Products
NVIDIA, Full Time
Senior (5 to 8 years)
Candidates should possess proven experience deploying deep learning models on GPU-based infrastructure, including NVIDIA GPUs, CUDA, and TensorRT. Strong knowledge of containerization (Docker, Kubernetes) and microservice architectures for ML model serving is required, along with proficiency in Python and at least one deep learning framework such as PyTorch or TensorFlow. Familiarity with compression techniques like quantization, pruning, and distillation, and experience profiling and optimizing model inference are also necessary.
The Machine Learning Engineer will develop high-performance GPU-based inference pipelines for large multimodal diffusion models, and will build and maintain serving infrastructure for low-latency predictions at scale. The role involves collaborating with DevOps teams on containerization and autoscaling, applying techniques such as quantization and pruning to optimize model performance, and designing and maintaining automated CI/CD pipelines for model deployment. The engineer will also explore cutting-edge GPU acceleration frameworks and implement robust monitoring and alerting systems.
Video captioning and translation services
Captions.ai enhances video content by providing captioning and translation services tailored for content creators, social media influencers, marketing agencies, and businesses. Its main offerings include automatic subtitle generation, translation into 28 languages, and video compression to improve performance. These tools simplify the video production process, allowing users to produce professional-quality videos with ease. Unlike many competitors, Captions.ai uses a freemium model, offering basic services for free while charging for advanced features, which helps attract a large user base and convert free users into paying customers. The company's goal is to make high-quality video content accessible to a wider audience, and recent funding will support its growth and product development.