Senior Inference Software Engineer at Etched.ai

San Jose, California, United States

Compensation: $200,000 – $300,000
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: AI, Semiconductors, Hardware

Requirements

  • Proficiency in C++ or Rust
  • Understanding of performance-sensitive or complex distributed software systems, such as Linux internals, accelerator architectures (e.g., GPUs, TPUs), compilers, or high-speed interconnects (e.g., NVLink, InfiniBand)
  • Familiarity with PyTorch or JAX
  • Experience porting applications to non-standard accelerators or other hardware platforms
  • (Strong candidates) Developed low-latency, high-performance applications using both kernel-level and user-space networking stacks
  • (Strong candidates) Deep understanding of distributed systems concepts, algorithms, and challenges, including consensus protocols, consistency models, and communication patterns
  • (Strong candidates) Solid grasp of Transformer architectures, particularly Mixture-of-Experts (MoE)
  • (Strong candidates) Built applications with extensive SIMD (Single Instruction, Multiple Data) optimizations for performance-critical paths

Responsibilities

  • Support porting state-of-the-art models to our architecture
  • Help build programming abstractions and testing capabilities to rapidly iterate on model porting
  • Build, enhance, and scale Sohu’s runtime, including multi-node inference, intra-node execution, state management, and robust error handling
  • Optimize routing and communication layers using Sohu’s collectives
  • Utilize performance profiling and debugging tools to identify bottlenecks and correctness issues

Skills

C++
Rust
PyTorch
JAX
distributed systems
GPUs
TPUs
compilers
NVLink
InfiniBand
SIMD
performance profiling
Transformers
Mixture-of-Experts

Etched.ai

Develops servers for transformer inference

About Etched.ai

Etched.ai develops servers for transformer inference. Its chips build the transformer architecture directly into the hardware, with the aim of making inference highly efficient.

Headquarters: Cupertino, CA, USA
Year Founded: 2022
Total Funding: $5.4M
Company Stage: Seed
Industries: Hardware
Employees: 11-50
