Deep Learning Engineer, Datacenters

Bengaluru, Karnataka, India

Compensation: Not Specified

Experience Level: Junior (1 to 2 years)

Job Type: Full Time

Visa: Unknown

Industries: Semiconductors & Hardware, Artificial Intelligence & Machine Learning, Data Center Equipment

Job Description

Position Overview

Location Type: Not Specified
Employment Type: Full Time
Job Type: Not Specified

NVIDIA is expanding its Datacenter business, and this role is central to optimizing datacenter deployments and establishing a data-driven approach to hardware design and software development. The team collaborates with various NVIDIA teams, including DL research, CUDA Kernel, and DL Framework development, as well as Silicon Architecture teams. This role focuses on understanding the relationships between CPU, GPU, networking, and I/O in the context of deep learning architectures for applications like Natural Language Processing, Computer Vision, and Autonomous Driving. The goal is to optimize next-generation systems and the Deep Learning Software Stack.

Responsibilities

Help develop software infrastructure to characterize and analyze a broad range of Deep Learning applications.
Evolve cost-efficient datacenter architectures tailored to meet the needs of Large Language Models (LLMs).
Work with experts to develop analysis and profiling tools in Python, bash, and C/C++ to measure key performance metrics of DL workloads running on NVIDIA systems.
Analyze system and software characteristics of DL applications.
Develop analysis tools and methodologies to measure key performance metrics and to estimate potential for efficiency improvement.

Requirements

A Bachelor’s degree in Electrical Engineering or Computer Science with 3+ years of relevant experience (Master’s or PhD degree preferred).
Experience in at least one of the following:
- System Software: Operating Systems (Linux), Compilers, GPU Kernels (CUDA), DL Frameworks (PyTorch, TensorFlow).
- Silicon Architecture and Performance Modeling/Analysis: CPU, GPU, Memory, or Network Architecture.
Programming experience in C/C++ and Python.
Exposure to Containerization Platforms (Docker) and Datacenter Workload Managers (Slurm) is a plus.
Demonstrated ability to work in virtual environments and a strong drive to own tasks from beginning to end.

Ways to Stand Out

Background with system software, operating system intrinsics, GPU kernels (CUDA), or DL Frameworks (PyTorch, TensorFlow).
Experience with silicon performance monitoring or profiling tools (e.g., perf, gprof, nvidia-smi, dcgm).
In-depth performance modeling experience in any one of CPU, GPU, Memory, or Network Architecture.
Exposure to Containerization Platforms (Docker) and Datacenter Workload Managers (Slurm).
Prior experience with multi-site teams or multi-functional teams.

Skills

Linux

CUDA

PyTorch

TensorFlow

Python

C++

Containerization

Docker

Slurm

Performance Modeling

System Software

Deep Learning

Hardware Architecture

NVIDIA

Designs GPUs and AI computing solutions

About NVIDIA

NVIDIA designs and manufactures graphics processing units (GPUs) and system-on-a-chip units (SoCs) for various markets, including gaming, professional visualization, data centers, and automotive. Its products include GPUs tailored for gaming and professional use, as well as platforms for artificial intelligence (AI) and high-performance computing (HPC) that cater to developers, data scientists, and IT administrators. NVIDIA generates revenue through the sale of hardware, software solutions, and cloud-based services, such as NVIDIA CloudXR and NGC, which enhance experiences in AI, machine learning, and computer vision. What sets NVIDIA apart from competitors is its strong focus on research and development, allowing it to maintain a leadership position in a competitive market. The company's goal is to drive innovation and provide advanced solutions that meet the needs of a diverse clientele, including gamers, researchers, and enterprises.

Headquarters: Santa Clara, California

Year Founded: 1993

Total Funding: $19.5M

Company Stage: IPO

Industries: Automotive & Transportation, Enterprise Software, AI & Machine Learning, Gaming

Employees: 10,001+

Benefits

Company Equity

401(k) Company Match

Risks

Increased competition from AI startups like xAI could challenge NVIDIA's market position.

Serve Robotics' expansion may divert resources from NVIDIA's core GPU and AI businesses.

Integration of VinBrain may pose challenges and distract from NVIDIA's primary operations.

Differentiation

NVIDIA leads in AI and HPC solutions with cutting-edge GPU technology.

The company excels in diverse markets, including gaming, data centers, and autonomous vehicles.

NVIDIA's cloud services, like CloudXR, offer scalable solutions for AI and machine learning.

Upsides

Acquisition of VinBrain enhances NVIDIA's AI capabilities in the healthcare sector.

Investment in Nebius Group boosts NVIDIA's AI infrastructure and cloud platform offerings.

Serve Robotics' expansion, backed by NVIDIA, highlights growth in autonomous delivery services.
