Senior System Software Engineer, Enterprise MODS at NVIDIA

Santa Clara, California, United States

Compensation: Not Specified
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: AI, HPC, Cloud Computing, Data Center

Requirements

  • Proven experience architecting diagnostics for complex server systems, especially at the SW/HW interface
  • Deep systems knowledge: x86/ARM architectures, Linux/Windows OS internals, firmware (UEFI/BIOS), BMC, and platform security
  • Ability to weigh tradeoffs in system development and drive optimal solutions with customers and multi-disciplinary teams
  • Expertise in programming languages like C, C++, and Python for tool development and automation
  • Familiarity with high-speed interconnects such as PCIe, InfiniBand, NVLink, and Ethernet
  • Strong communication skills to engage with technical and executive teams
  • BS/MS or equivalent experience in Computer Science, Electrical Engineering, or related field
  • 12+ years of engineering experience in diagnostics, embedded systems, or cloud platforms

Responsibilities

  • Develop diagnostic systems for NVIDIA data center platforms, including hardware and software tools that generate worst-case stress workloads for CPUs, GPUs, memory, storage, and interconnects
  • Lead platform bring-up and integration, ensuring diagnostics are embedded early and effectively across the server lifecycle
  • Drive hardware validation strategy in collaboration with architecture and hardware teams, crafting robust validation plans for new server generations
  • Analyze root causes of complex failures, acting as a Level 2 engineering contact for critical issues and offering scalable solutions across the stack
  • Develop diagnostics software to ensure quality and performance at scale across ODM and partner production lines
  • Mentor and grow engineering teams, providing technical leadership and encouraging a culture of innovation and excellence
  • Influence long-term strategy by developing diagnostic architecture and roadmaps for upcoming NVIDIA and partner products

Skills

Diagnostic Systems
Hardware Validation
Platform Bring-up
Stress Workloads
Root Cause Analysis
GPU Diagnostics
CPU Testing
Memory Diagnostics
Storage Testing
Interconnect Validation
ODM Integration
Server Platforms

NVIDIA

Designs GPUs and AI computing solutions

About NVIDIA

NVIDIA designs and manufactures graphics processing units (GPUs) and system-on-a-chip units (SoCs) for various markets, including gaming, professional visualization, data centers, and automotive. Its products include GPUs tailored for gaming and professional use, as well as platforms for artificial intelligence (AI) and high-performance computing (HPC) that cater to developers, data scientists, and IT administrators. NVIDIA generates revenue through the sale of hardware, software solutions, and cloud-based services, such as NVIDIA CloudXR and NGC, which support AI, machine learning, and computer vision workloads. What sets NVIDIA apart from competitors is its strong focus on research and development, allowing it to maintain a leadership position in a competitive market. The company's goal is to drive innovation and provide advanced solutions that meet the needs of a diverse clientele, including gamers, researchers, and enterprises.

Headquarters: Santa Clara, California
Year Founded: 1993
Total Funding: $19.5M
Company Stage: IPO
Industries: Automotive & Transportation, Enterprise Software, AI & Machine Learning, Gaming
Employees: 10,001+

Benefits

Company Equity
401(k) Company Match

Risks

Increased competition from AI startups like xAI could challenge NVIDIA's market position.
Serve Robotics' expansion may divert resources from NVIDIA's core GPU and AI businesses.
Integration of VinBrain may pose challenges and distract from NVIDIA's primary operations.

Differentiation

NVIDIA leads in AI and HPC solutions with cutting-edge GPU technology.
The company excels in diverse markets, including gaming, data centers, and autonomous vehicles.
NVIDIA's cloud services, like CloudXR, offer scalable solutions for AI and machine learning.

Upsides

Acquisition of VinBrain enhances NVIDIA's AI capabilities in the healthcare sector.
Investment in Nebius Group boosts NVIDIA's AI infrastructure and cloud platform offerings.
Serve Robotics' expansion, backed by NVIDIA, highlights growth in autonomous delivery services.
