Machine Learning Co-Design Researcher at Etched.ai

San Jose, California, United States

Compensation: Not Specified
Experience Level: Junior (1 to 2 years)
Job Type: Full Time
Visa: Unknown
Industries: Artificial Intelligence, Semiconductor, Hardware

Requirements

Candidates should possess co-design expertise across both software and hardware domains, strong software engineering skills with systems programming experience, and a deep knowledge of transformer model architectures and/or inference serving stacks such as vLLM and SGLang. Strong mathematical skills, particularly in linear algebra, are required, along with the ability to reason about performance bottlenecks and optimization opportunities. Experience working cross-functionally in diverse software and hardware organizations is also beneficial.

Responsibilities

The Machine Learning Co-Design Researcher will translate core mathematical operations from transformer models into optimized operation sequences for Sohu, develop and apply a deep understanding of Sohu to co-design both hardware instructions and model architecture operations, and implement high-performance software components for the Model Toolkit.

They will collaborate with hardware engineers to maximize chip utilization and minimize latency, implement efficient batching strategies and execution plans for inference workloads, design and implement cutting-edge inference-time compute scaling methods, alter and fine-tune model architectures or inference-time compute algorithms, and contribute to the evolution of the company's system architecture and programming model.

Specific tasks include optimizing operation sequences to make full use of Sohu’s computational resources, researching and implementing efficient memory management techniques, developing algorithms for continuous batching and batch interleaving, and researching and implementing model-specific inference-time acceleration algorithms such as speculative decoding and KV cache sharing.

Skills

Transformer model architectures
Inference serving stacks
Systems programming
Linear algebra
Performance optimization
Software engineering
Mathematical skills
Hardware design
Batching strategies
Memory management

Etched.ai

Develops servers for transformer inference

About Etched.ai

The company develops servers for transformer inference, built around chips that integrate the transformer architecture directly into the hardware. Its core technologies are the transformer architecture and advanced chip integration.

Headquarters: Cupertino, CA, USA
Year Founded: 2022
Total Funding: $5.4M
Company Stage: Seed
Industries: Hardware
Employees: 11-50
