Senior Deep Learning Systems Engineer, Datacenters at NVIDIA

Santa Clara, California, United States

Compensation: Not Specified
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Technology, Artificial Intelligence, Datacenter

Requirements

  • Bachelor’s degree in Electrical Engineering or Computer Science, or equivalent experience (Master’s or PhD preferred)
  • 8 years or more of relevant experience
  • Experience in at least one of the following: system software (Linux operating systems, compilers, CUDA GPU kernels, or DL frameworks such as PyTorch and TensorFlow); silicon architecture and performance modeling/analysis (CPU, GPU, memory, or network architecture)
  • Experience programming in C/C++ and Python
  • Deep understanding of computer system architecture and performance analysis, with demonstrated hands-on experience
  • Demonstrated ability to work in virtual environments and strong drive to own tasks from beginning to end

Responsibilities

  • Help develop software infrastructure to characterize and analyze a broad range of Deep Learning applications
  • Evolve cost-efficient datacenter architectures tailored to meet the needs of Large Language Models (LLMs)
  • Work with experts to help develop analysis and profiling tools in Python, Bash, and C++ to measure key performance metrics of DL workloads running on NVIDIA systems
  • Analyze system and software characteristics of DL applications
  • Develop analysis tools and methodologies to measure key performance metrics and to estimate potential for efficiency improvement

Skills

Key technologies and capabilities for this role

Python, C++, Bash, Linux, CUDA, PyTorch, TensorFlow, GPU, Performance Modeling, Operating Systems, Compilers, Deep Learning, Profiling Tools, Networking

Questions & Answers

Common questions about this position

What education and experience are required for this Senior Deep Learning Systems Engineer role?

A Bachelor’s degree in Electrical Engineering or Computer Science, or equivalent experience, is required, and a Master’s or PhD is preferred. The role also calls for 8 or more years of relevant experience.

What key skills and experiences are needed for this position?

Candidates need experience in at least one of two areas: system software, such as operating systems (Linux), compilers, GPU kernels (CUDA), or DL frameworks (PyTorch, TensorFlow); or silicon architecture and performance modeling/analysis for CPU, GPU, memory, or network architecture. Programming experience in C/C++ and Python is also required.

Is this a remote position, or does it require office work?

This information is not specified in the job description.

What is the salary or compensation for this role?

This information is not specified in the job description.

What makes a candidate stand out for this Deep Learning Systems Engineer position?

Candidates stand out with a background in system software, OS intrinsics, GPU kernels (CUDA), or DL frameworks; experience with silicon performance tools such as perf, gprof, and nvidia-smi; in-depth performance modeling of CPU, GPU, memory, or network architecture; and exposure to Docker and Slurm. Prior experience working in virtual environments also helps.

NVIDIA

Designs GPUs and AI computing solutions

About NVIDIA

NVIDIA designs graphics processing units (GPUs) and systems on a chip (SoCs) for various markets, including gaming, professional visualization, data centers, and automotive. Its products include GPUs tailored for gaming and professional use, as well as platforms for artificial intelligence (AI) and high-performance computing (HPC) that cater to developers, data scientists, and IT administrators. NVIDIA generates revenue through the sale of hardware, software solutions, and cloud-based services, such as NVIDIA CloudXR and NGC, which support work in AI, machine learning, and computer vision. NVIDIA differentiates itself from competitors through a strong focus on research and development, which helps it maintain a leadership position in a competitive market. The company's goal is to drive innovation and provide advanced solutions for a diverse clientele, including gamers, researchers, and enterprises.

Headquarters: Santa Clara, California
Year Founded: 1993
Total Funding: $19.5M
Company Stage: IPO
Industries: Automotive & Transportation, Enterprise Software, AI & Machine Learning, Gaming
Employees: 10,001+

Benefits

Company Equity
401(k) Company Match

Risks

Increased competition from AI startups like xAI could challenge NVIDIA's market position.
Serve Robotics' expansion may divert resources from NVIDIA's core GPU and AI businesses.
Integration of VinBrain may pose challenges and distract from NVIDIA's primary operations.

Differentiation

NVIDIA leads in AI and HPC solutions with cutting-edge GPU technology.
The company excels in diverse markets, including gaming, data centers, and autonomous vehicles.
NVIDIA's cloud services, like CloudXR, offer scalable solutions for AI and machine learning.

Upsides

Acquisition of VinBrain enhances NVIDIA's AI capabilities in the healthcare sector.
Investment in Nebius Group boosts NVIDIA's AI infrastructure and cloud platform offerings.
Serve Robotics' expansion, backed by NVIDIA, highlights growth in autonomous delivery services.
