NVIDIA

Deep Learning Solutions Architect – Large Scale Inference Optimization

United Kingdom

Compensation: Not Specified
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Artificial Intelligence, Computer Hardware, Semiconductors

Solution Architect - Deep Learning Inference

Employment Type: Full time

Position Overview

NVIDIA’s Worldwide Field Operations (WWFO) team is seeking a Solution Architect with a strong focus on Deep Learning and a deep understanding of neural network inference. With the introduction of NVIDIA Grace CPUs and Grace-Hopper / Grace-Blackwell systems, the CPU has become more tightly integrated into the AI platform than ever before. Innovations such as Chip-to-Chip NVLink and the significant expansion of the NVLink domain have enabled a wide range of new neural network architectures and approaches to training and inference.

The ideal candidate will be proficient with tools such as TRT-LLM, vLLM, SGLang, or similar, and will have strong systems knowledge, enabling customers to make full use of the new GB200 NVL72 systems: for example, by helping customers adopt disaggregated inference, working on efficient KV cache offloading, or supporting inference for new architectures such as hybrid or diffusion models.

Solutions Architects work with the most exciting computing hardware and software, driving the latest breakthroughs in artificial intelligence! We need individuals who can enable customer productivity and develop lasting relationships with our technology partners, making NVIDIA an integral part of end-user solutions. We are looking for someone who is passionate about artificial intelligence, who can keep up with a fast-moving field, and who can coordinate efforts between corporate marketing, industry business development, and engineering.

Responsibilities

  • Work directly with key customers to understand their technology and provide the best AI solutions.
  • Perform in-depth analysis and optimization to ensure the best performance on GPU architecture systems (in particular Grace/Arm-based systems). This includes supporting the optimization of large-scale inference pipelines.
  • Partner with Engineering, Product, and Sales teams to develop and plan the most suitable solutions for customers.
  • Enable development and growth of product features through customer feedback and proof-of-concept evaluations.

Requirements

  • Excellent verbal and written communication and technical presentation skills in English.
  • MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields.
  • 5+ years of work or research experience with Python, C++, or other software development.
  • Work experience with and knowledge of modern NLP, including a good understanding of transformer, state space, diffusion, and MoE model architectures. This can include expertise in either training or optimization/compression/operation of DNNs.
  • Understanding of key libraries used for NLP/LLM training (such as Megatron-LM, NeMo, DeepSpeed, etc.) and/or deployment (e.g., TensorRT-LLM, vLLM, Triton Inference Server).
  • Enthusiasm for working with multiple levels and teams across organizations (Engineering, Product, Sales, and Marketing teams).
  • Capable of working in a constantly evolving environment without losing focus.
  • Self-starter with a growth mindset, a passion for continuous learning, and a habit of sharing findings across the team.

Ways to Stand Out from the Crowd

  • Experience running/debugging large-scale distributed DL training or inference.
  • Proven track record in optimizing neural network training performance and robustness, including implementing asynchronous checkpointing and optimizing CUDA kernels.
  • Understanding of HPC systems: data center design, high-speed interconnects such as InfiniBand, and cluster storage and scheduling, with related design and/or management experience.

Company Information

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer to you and your family: www.nvidiabenefits.com

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. We highly value diversity in our current and future employees.

Skills

Deep Learning
Neural Network Inference
TRT-LLM
vLLM
SGLang
Systems Knowledge
GPU Architecture
Grace/ARM Systems
Large Scale Inference Pipelines
Artificial Intelligence
NVIDIA Grace CPUs
Grace-Hopper
Grace-Blackwell Systems
Chip-to-Chip NVLink
KV Cache Offloading
Hybrid Models
Diffusion Models

NVIDIA

Designs GPUs and AI computing solutions

About NVIDIA

NVIDIA designs and manufactures graphics processing units (GPUs) and system-on-a-chip units (SoCs) for various markets, including gaming, professional visualization, data centers, and automotive. Its products include GPUs tailored for gaming and professional use, as well as platforms for artificial intelligence (AI) and high-performance computing (HPC) that cater to developers, data scientists, and IT administrators. NVIDIA generates revenue through the sale of hardware, software solutions, and cloud-based services, such as NVIDIA CloudXR and NGC, which enhance experiences in AI, machine learning, and computer vision. What sets NVIDIA apart from competitors is its strong focus on research and development, allowing it to maintain a leadership position in a competitive market. The company's goal is to drive innovation and provide advanced solutions that meet the needs of a diverse clientele, including gamers, researchers, and enterprises.

Headquarters: Santa Clara, California
Year Founded: 1993
Total Funding: $19.5M
Company Stage: IPO
Industries: Automotive & Transportation, Enterprise Software, AI & Machine Learning, Gaming
Employees: 10,001+

Benefits

Company Equity
401(k) Company Match

Risks

Increased competition from AI startups like xAI could challenge NVIDIA's market position.
Serve Robotics' expansion may divert resources from NVIDIA's core GPU and AI businesses.
Integration of VinBrain may pose challenges and distract from NVIDIA's primary operations.

Differentiation

NVIDIA leads in AI and HPC solutions with cutting-edge GPU technology.
The company excels in diverse markets, including gaming, data centers, and autonomous vehicles.
NVIDIA's cloud services, like CloudXR, offer scalable solutions for AI and machine learning.

Upsides

Acquisition of VinBrain enhances NVIDIA's AI capabilities in the healthcare sector.
Investment in Nebius Group boosts NVIDIA's AI infrastructure and cloud platform offerings.
Serve Robotics' expansion, backed by NVIDIA, highlights growth in autonomous delivery services.
