Senior Software QA Test Development Engineer - Diagnostics at NVIDIA

Santa Clara, California, United States

Compensation: Not Specified
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Technology, AI, HPC, Automotive

Requirements

  • Bachelor’s Degree (or equivalent experience) in a STEM (Science, Technology, Engineering, Math or Physics) field
  • 5+ years of proven experience, or a master’s degree
  • Proven experience in OS- and server-level automation, CI/CD processes, and DevOps using Python, shell scripting, Ansible, Jenkins, C/C++, Java, and JavaScript
  • Strong server and Linux (Ubuntu, Red Hat, CentOS, SUSE, Fedora, etc.) troubleshooting and debugging experience in bare-metal and KVM/VMware/Hyper-V environments
  • Good knowledge of and hands-on experience with model testing, AI tools and frameworks (TensorFlow, PyTorch, Cursor, etc.), and NLP and LLM benchmarking
  • Experience using AI development tools for test plan creation, test case development, and test case automation
  • Strong experience in FW, BMC/OpenBMC, network protocols, internal/external enterprise storage devices, PCIe buses and devices, IO sub-devices, CPU and memory, ACPI, the UEFI spec, and Redfish - huge plus
  • Proven experience with GitHub/GitLab/Gerrit, PXE, SLURM, and Stack/Kubernetes/Docker - huge plus
  • Background in enterprise server integration, Linux, reliability testing with various telemetry sources, scale-out clusters, and test plan development, plus a track record in developing AI tools and NLP, DevOps, and CI/CD
  • Thrives in a diverse work environment, has outstanding interpersonal skills and possesses a strong sense of engagement and continuous process improvement

Responsibilities

  • Develop and execute NVIDIA HGX/DGX/MGX platform test plans for servers, OS, FW, and the CUDA SW stack, based on design documents
  • Install and test various operating systems, server firmware, and SW stacks
  • Drive root cause analysis of reliability and validation test failures and follow through to mitigation
  • Build, develop, and debug front-end and back-end automation frameworks and tests at the server and OS level
  • Review partner and supplier test results and prescribe additional reliability testing on components, servers, and packaging as needed
  • Work in an agile software development team with very high production quality standards
  • Manage the bug lifecycle and collaborate across groups to drive solutions

Skills

Linux
Reliability Testing
Test Plan Development
AI Tools
NLP
DevOps
CI/CD
Automation
CUDA
Server Integration
Root Cause Analysis
Agile

NVIDIA

Designs GPUs and AI computing solutions

About NVIDIA

NVIDIA designs and manufactures graphics processing units (GPUs) and system on a chip units (SoCs) for various markets, including gaming, professional visualization, data centers, and automotive. Their products include GPUs tailored for gaming and professional use, as well as platforms for artificial intelligence (AI) and high-performance computing (HPC) that cater to developers, data scientists, and IT administrators. NVIDIA generates revenue through the sale of hardware, software solutions, and cloud-based services, such as NVIDIA CloudXR and NGC, which enhance experiences in AI, machine learning, and computer vision. What sets NVIDIA apart from competitors is its strong focus on research and development, allowing it to maintain a leadership position in a competitive market. The company's goal is to drive innovation and provide advanced solutions that meet the needs of a diverse clientele, including gamers, researchers, and enterprises.

Headquarters: Santa Clara, California
Year Founded: 1993
Total Funding: $19.5M
Company Stage: IPO
Industries: Automotive & Transportation, Enterprise Software, AI & Machine Learning, Gaming
Employees: 10,001+

Benefits

Company Equity
401(k) Company Match

Risks

Increased competition from AI startups like xAI could challenge NVIDIA's market position.
Serve Robotics' expansion may divert resources from NVIDIA's core GPU and AI businesses.
Integration of VinBrain may pose challenges and distract from NVIDIA's primary operations.

Differentiation

NVIDIA leads in AI and HPC solutions with cutting-edge GPU technology.
The company excels in diverse markets, including gaming, data centers, and autonomous vehicles.
NVIDIA's cloud services, like CloudXR, offer scalable solutions for AI and machine learning.

Upsides

Acquisition of VinBrain enhances NVIDIA's AI capabilities in the healthcare sector.
Investment in Nebius Group boosts NVIDIA's AI infrastructure and cloud platform offerings.
Serve Robotics' expansion, backed by NVIDIA, highlights growth in autonomous delivery services.
