Inference Software Engineer at Etched.ai

San Jose, California, United States

Compensation: $175,000 – $275,000
Experience Level: Junior (1 to 2 years)
Job Type: Full Time
Visa: Unknown
Industries: Artificial Intelligence, Semiconductor, Software

Requirements

Candidates should be proficient in Rust and/or C++ and have good familiarity with PyTorch and/or JAX. They should also have experience porting applications to non-standard or accelerator hardware platforms, as well as solid systems knowledge, including Linux internals, accelerator architectures (e.g., GPUs, TPUs), and high-speed interconnects (e.g., NVLink, InfiniBand). Strong candidates may have experience with low-latency, high-performance applications using both kernel-level and user-space networking stacks, and a deep understanding of distributed systems concepts, algorithms, and challenges.

Responsibilities

The Inference Software Engineer will support porting state-of-the-art models to the company’s architecture and help build programming abstractions and testing capabilities for rapid iteration on model porting. They will scale and enhance Sohu’s runtime, including multi-node inference, intra-node execution, state management, and robust error handling, and optimize routing and communication layers using Sohu’s collectives. They will also develop tools for performance profiling and debugging to identify bottlenecks and correctness issues. The role may additionally involve analyzing performance traces and logs from distributed systems and ML workloads, building applications with extensive SIMD (Single Instruction, Multiple Data) optimizations for performance-critical paths, and becoming familiar with cluster orchestration tools (e.g., Kubernetes, Slurm) and ML platforms (e.g., Ray, Kubeflow).

Skills

Rust
C++
PyTorch
JAX
transformers
Linux
Accelerator Architectures
NVLink
InfiniBand
Performance Profiling
Debugging
Systems Knowledge

Etched.ai

Develops servers for transformer inference

About Etched.ai

The company develops servers for transformer inference. Its chips integrate the transformer architecture directly into the hardware to achieve high efficiency. The product's main technologies are the transformer architecture and advanced chip integration.

Headquarters: Cupertino, CA, USA
Year Founded: 2022
Total Funding: $5.4M
Company Stage: Seed
Industries: Hardware
Employees: 11-50
