DL System Software Engineer - AI Platform at NVIDIA

Toronto, Ontario, Canada

Compensation: Not Specified
Experience Level: Mid-level (3 to 4 years), Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Technology, AI

Requirements

  • Bachelor's degree or equivalent experience in Computer Science, Computer Engineering, or relevant technical field
  • 5+ years of experience
  • Experience building large-scale systems from scratch
  • Prior experience in container-based deployment systems like Kubernetes is beneficial
  • Strong coding skills in programming languages like Python, Go, Rust and/or C/C++
  • Solid foundation in computer science and computer engineering topics: algorithms and data structures, operating systems, computer architecture, etc.
  • Strong understanding of AI and related technologies is a huge plus
  • Ability to quickly grasp new concepts and thrive in evolving situations

Ways to stand out

  • Graduate-level education or relevant practical background, particularly in research, is beneficial
  • Practical experience in building and optimizing AI applications is highly desired
  • Proficiency in container software such as containerd, CRI-O, Linux namespaces, and CRIU, and in NVIDIA GPU technology such as CUDA Graphs and the driver/runtime, is greatly advantageous

Responsibilities

  • Taking part in the development of NVIDIA's AI platform for training, fine-tuning, and serving the latest AI models with the best performance and efficiency
  • Designing and building solutions for scheduling large-scale AI training and inference workloads on GPU clusters across many cloud infrastructures
  • Exploring and finding solutions for open problems like industry-scale resource management, GPU scheduling, performance prediction, and live workload migration
  • Working with and contributing to adjacent teams like the TensorRT/Dynamo inference engine, ML compiler, KAI/Grove scheduler, Lepton cloud, etc.

Skills

Python
Go
Rust
C/C++
Kubernetes
TensorRT
ML Compiler
GPU Clusters
Cluster Scheduler
Operating Systems
Algorithms
Data Structures
Computer Architecture

NVIDIA

Designs GPUs and AI computing solutions

About NVIDIA

NVIDIA designs and manufactures graphics processing units (GPUs) and system-on-a-chip units (SoCs) for various markets, including gaming, professional visualization, data centers, and automotive. Its products include GPUs tailored for gaming and professional use, as well as platforms for artificial intelligence (AI) and high-performance computing (HPC) that cater to developers, data scientists, and IT administrators. NVIDIA generates revenue through the sale of hardware, software solutions, and cloud-based services, such as NVIDIA CloudXR and NGC, which enhance experiences in AI, machine learning, and computer vision. What sets NVIDIA apart from competitors is its strong focus on research and development, allowing it to maintain a leadership position in a competitive market. The company's goal is to drive innovation and provide advanced solutions that meet the needs of a diverse clientele, including gamers, researchers, and enterprises.

Headquarters: Santa Clara, California
Year Founded: 1993
Total Funding: $19.5M
Company Stage: IPO
Industries: Automotive & Transportation, Enterprise Software, AI & Machine Learning, Gaming
Employees: 10,001+

Benefits

Company Equity
401(k) Company Match

Risks

Increased competition from AI startups like xAI could challenge NVIDIA's market position.
Serve Robotics' expansion may divert resources from NVIDIA's core GPU and AI businesses.
Integration of VinBrain may pose challenges and distract from NVIDIA's primary operations.

Differentiation

NVIDIA leads in AI and HPC solutions with cutting-edge GPU technology.
The company excels in diverse markets, including gaming, data centers, and autonomous vehicles.
NVIDIA's cloud services, like CloudXR, offer scalable solutions for AI and machine learning.

Upsides

Acquisition of VinBrain enhances NVIDIA's AI capabilities in the healthcare sector.
Investment in Nebius Group boosts NVIDIA's AI infrastructure and cloud platform offerings.
Serve Robotics' expansion, backed by NVIDIA, highlights growth in autonomous delivery services.
