d-Matrix

Machine Learning Engineer, Staff - Model Factory

Santa Clara, California, United States

Compensation: $155,000 – $250,000
Experience Level: Mid-level (3 to 4 years), Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: AI & Machine Learning, Hardware

Requirements

Candidates must have strong Python programming skills and experience with machine learning frameworks such as PyTorch, TensorFlow, or JAX. Hands-on experience with model optimization, quantization, and inference acceleration is required, along with a deep understanding of Transformer architectures and distributed inference techniques. Familiarity with memory-efficient inference techniques is essential, as is a solid grasp of software engineering best practices, including CI/CD and containerization technologies such as Docker and Kubernetes.
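To illustrate the kind of quantization work the role involves, here is a minimal sketch of symmetric post-training int8 quantization in plain Python (the function names `quantize_int8` and `dequantize` are illustrative, not from d-Matrix's stack):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map the largest |w| to 127."""
    scale = max(abs(w) for w in weights) / 127.0
    # Round each weight to the nearest int8 step, clamping to [-127, 127]
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights; error per element is at most scale/2."""
    return [v * scale for v in q]

weights = [0.42, -1.3, 0.07, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Real inference stacks use per-channel or per-group scales and calibration data, but the core idea (trade a small, bounded rounding error for 4x smaller weights and faster integer math) is the same.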

Responsibilities

The Machine Learning Engineer will design, build, and optimize deployment pipelines for large-scale models. They will implement and enhance model inference frameworks and develop automated workflows for model development, experimentation, and deployment. Collaboration with research, architecture, and engineering teams to improve model performance and efficiency is expected. The role also involves working with distributed computing frameworks to optimize model parallelism and deployment, implementing scalable KV caching and memory-efficient inference techniques, and monitoring and optimizing infrastructure performance across the company's custom hardware stack.
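The KV caching mentioned above can be sketched in a few lines; this is a simplified illustration (the `KVCache` class is hypothetical, not d-Matrix's implementation), showing why caching turns per-token attention cost from quadratic to linear in sequence length:

```python
class KVCache:
    """Minimal per-layer key/value cache for autoregressive decoding.

    Rather than recomputing keys and values for the entire prefix at every
    decoding step, each new token's K/V pair is appended once and reused,
    so each step only computes attention against the stored sequence.
    """

    def __init__(self):
        self.keys = []    # one entry per prompt/generated token
        self.values = []

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)
        # Return the full K/V history for this step's attention computation
        return self.keys, self.values

    def __len__(self):
        return len(self.keys)
```

Production systems add paging, eviction, and sharding of this cache across devices, which is where the "scalable" part of the responsibility comes in.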

Skills

Python
PyTorch
TensorFlow
JAX
Model Optimization
Quantization
Inference Acceleration
Transformer Architectures
Attention Mechanisms
Distributed Inference
Tensor Parallel
Pipeline Parallel
Sequence Parallel
KV Caching
Ray

d-Matrix

AI compute platform for datacenters

About d-Matrix

d-Matrix focuses on improving the efficiency of AI computing for large datacenter customers. Its main product is the digital in-memory compute (DIMC) engine, which integrates compute directly into programmable memory. This design reduces power consumption and increases data processing speed while preserving accuracy. d-Matrix differentiates itself from competitors through a modular, scalable approach built on low-power chiplets that can be tailored to different applications. The company's goal is to provide high-performance, energy-efficient AI inference solutions to large-scale datacenter operators.

Key Metrics

Headquarters: Santa Clara, California
Year Founded: 2019
Total Funding: $149.8M
Company Stage: Series B
Industries: Enterprise Software, AI & Machine Learning
Employees: 201-500

Benefits

Hybrid Work Options

Risks

Competition from Nvidia, AMD, and Intel may pressure d-Matrix's market share.
Complex AI chip design could lead to delays or increased production costs.
Rapid AI innovation may render d-Matrix's technology obsolete if not updated.

Differentiation

d-Matrix's DIMC engine integrates compute into memory, enhancing efficiency and accuracy.
The company offers scalable AI solutions through modular, low-power chiplets.
d-Matrix focuses on brain-inspired AI compute engines for diverse inferencing workloads.

Upsides

Growing demand for energy-efficient AI solutions boosts d-Matrix's low-power chiplets appeal.
Partnerships with companies like Microsoft could lead to strategic alliances.
Increasing adoption of modular AI hardware in data centers benefits d-Matrix's offerings.
