[Remote] Senior Solutions Architect - Networking at NVIDIA

Washington, United States

Compensation: Not Specified
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Biotechnology

Requirements

  • BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields
  • 8+ years of professional experience in networking fundamentals, TCP/IP stack, InfiniBand fundamentals and data center architecture
  • Proficiency in configuring, testing, validating, and resolving issues in Ethernet and InfiniBand networks, especially in medium to large-scale HPC/AI environments
  • Advanced knowledge of HPC/AI networking protocols
  • Hands-on experience with network switch/router platforms like Cumulus Linux, SONiC, IOS, Junos OS, and EOS
  • Strong focus on customer needs and satisfaction
  • Self-motivated with leadership skills to work collaboratively with customers and internal teams
  • Strong written, verbal, and listening skills
  • Familiarity with cloud networks (AWS, GCP, Azure) is a plus
  • Knowledge of link-level performance and diagnostics
  • Experience with high-performance computing architectures
  • Experience with GPU-focused hardware and software
  • Linux or Networking Certifications
  • Ability to work in a dynamic customer-focused team
  • Excellent interpersonal skills

Responsibilities

  • Build AI/HPC infrastructure for a large CSP customer and their end users
  • Support operational and reliability aspects of large-scale AI clusters
  • Focus on performance at scale, real-time monitoring, logging, and alerting
  • Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement
  • Maintain services once they are live by measuring and monitoring availability, latency, and overall system health
  • Provide feedback to internal teams by opening bugs, documenting workarounds, driving customer feature requirements, and suggesting improvements
  • Collaborate with customers and internal teams to analyze, define, and implement large-scale networking projects
  • Analyze and optimize network configurations for improved performance and reliability
  • Troubleshoot and resolve network issues in medium to large-scale HPC/AI environments
  • Develop and implement automation scripts to improve network efficiency and reduce manual labor
  • Stay up-to-date with the latest networking technologies and protocols
  • Participate in project planning, design, and implementation
  • Collaborate with cross-functional teams to ensure successful project delivery
  • Develop and maintain documentation for network configurations, troubleshooting, and maintenance
  • Ensure compliance with NVIDIA's security and data protection policies

Skills

Networking fundamentals
TCP/IP stack
InfiniBand fundamentals
Data center architecture
Ethernet
HPC networking protocols
System Design
Automation
Performance at scale
Real-time monitoring
Logging
Alerting

NVIDIA

Designs GPUs and AI computing solutions

About NVIDIA

NVIDIA designs and manufactures graphics processing units (GPUs) and system-on-a-chip units (SoCs) for various markets, including gaming, professional visualization, data centers, and automotive. Its products include GPUs tailored for gaming and professional use, as well as platforms for artificial intelligence (AI) and high-performance computing (HPC) that cater to developers, data scientists, and IT administrators. NVIDIA generates revenue through the sale of hardware, software solutions, and cloud-based services, such as NVIDIA CloudXR and NGC, which enhance experiences in AI, machine learning, and computer vision. What sets NVIDIA apart from competitors is its strong focus on research and development, which allows it to maintain a leadership position in a competitive market. The company's goal is to drive innovation and provide advanced solutions that meet the needs of a diverse clientele, including gamers, researchers, and enterprises.

Headquarters: Santa Clara, California
Year Founded: 1993
Total Funding: $19.5M
Company Stage: IPO
Industries: Automotive & Transportation, Enterprise Software, AI & Machine Learning, Gaming
Employees: 10,001+

Benefits

Company Equity
401(k) Company Match

Risks

Increased competition from AI startups like xAI could challenge NVIDIA's market position.
Serve Robotics' expansion may divert resources from NVIDIA's core GPU and AI businesses.
Integration of VinBrain may pose challenges and distract from NVIDIA's primary operations.

Differentiation

NVIDIA leads in AI and HPC solutions with cutting-edge GPU technology.
The company excels in diverse markets, including gaming, data centers, and autonomous vehicles.
NVIDIA's cloud services, like CloudXR, offer scalable solutions for AI and machine learning.

Upsides

Acquisition of VinBrain enhances NVIDIA's AI capabilities in the healthcare sector.
Investment in Nebius Group boosts NVIDIA's AI infrastructure and cloud platform offerings.
Serve Robotics' expansion, backed by NVIDIA, highlights growth in autonomous delivery services.
