ML Compiler Architect, Senior Principal at d-Matrix

Toronto, Ontario, Canada

Compensation: Not Specified
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Artificial Intelligence, Semiconductors

Requirements

  • BS with 15+ years / MS with 12+ years / PhD with 10+ years in Computer Science or Electrical Engineering
  • 12+ years of experience in front-end compiler and systems software development, focused on ML inference
  • Deep experience in designing or leading compiler efforts using MLIR, LLVM, Torch-MLIR, or similar frameworks
  • Strong understanding of model optimization for inference: quantization, fusion, tensor layout transformation, memory hierarchy utilization, and scheduling (a brief quantization sketch follows this section)
  • Expertise in deploying ML models to heterogeneous compute environments, with attention to latency, throughput, and resource scaling in cloud systems
  • Proven track record working with AI frameworks (e.g., PyTorch, TensorFlow), ONNX, and hardware backends
  • Experience with cloud infrastructure, including resource provisioning, distributed execution, and profiling tools
Preferred Qualifications

  • Experience targeting inference accelerators (AI ASICs, FPGAs, GPUs) in cloud-scale environments
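
For context on the quantization requirement above, here is a minimal sketch of post-training dynamic quantization in PyTorch. The toy model is an assumption chosen for illustration; it is not d-Matrix's workload or toolchain.

    import torch
    import torch.nn as nn

    # Toy stand-in for a Transformer projection block (illustrative only).
    model = nn.Sequential(
        nn.Linear(512, 2048),
        nn.ReLU(),
        nn.Linear(2048, 512),
    ).eval()

    # Dynamic quantization stores Linear weights as int8 and quantizes
    # activations on the fly at inference time, shrinking the model and
    # speeding up CPU matmuls at a small accuracy cost.
    quantized = torch.ao.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    with torch.no_grad():
        print(quantized(torch.randn(1, 512)).shape)  # torch.Size([1, 512])

Production inference stacks apply far richer schemes (static quantization, per-channel scales, hardware-specific kernels), but the basic rewrite-the-model-for-inference shape is the same.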

Responsibilities

  • Architect the MLIR-based compiler for cloud inference workloads, focusing on efficient mapping of large-scale AI models (e.g., LLMs and other Transformer architectures, ingested via Torch-MLIR) onto distributed compute and memory hierarchies
  • Lead the development of compiler passes for model partitioning, operator fusion, tensor layout optimization, memory tiling, and latency-aware scheduling (a toy fusion-pass sketch follows this list)
  • Design support for hybrid offline/online compilation and deployment flows with runtime-aware mapping, allowing for adaptive resource utilization and load balancing in cloud scenarios
  • Define compiler abstractions that interoperate efficiently with runtime systems, orchestration layers, and cloud deployment frameworks
  • Drive scalability, reproducibility, and performance through well-designed IR transformations and distributed execution strategies
  • Mentor and guide a team of compiler engineers to deliver high-performance inference-optimized software stacks
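
As a rough illustration of what an operator-fusion pass does, the sketch below rewrites a matmul-plus-add pair into a single fused op over a toy, list-shaped IR. The IR, op names, and pattern are invented for this example; real MLIR passes pattern-match over SSA regions and dialects, not flat lists.

    from dataclasses import dataclass

    # Toy IR: a linear sequence of ops (hypothetical; real MLIR works
    # on SSA-based regions, not flat op lists).
    @dataclass
    class Op:
        name: str
        inputs: list
        output: str

    def fuse_matmul_add(ops):
        """Fuse a matmul whose result feeds an add into one fused op,
        the same shape of rewrite a fusion pass performs."""
        fused, i = [], 0
        while i < len(ops):
            cur = ops[i]
            nxt = ops[i + 1] if i + 1 < len(ops) else None
            # Pattern: the add consumes the matmul's result directly.
            if (cur.name == "matmul" and nxt is not None
                    and nxt.name == "add" and cur.output in nxt.inputs):
                bias = [v for v in nxt.inputs if v != cur.output]
                fused.append(Op("fused_matmul_add", cur.inputs + bias, nxt.output))
                i += 2
            else:
                fused.append(cur)
                i += 1
        return fused

    prog = [
        Op("matmul", ["x", "w"], "t0"),
        Op("add", ["t0", "b"], "y"),
    ]
    print([op.name for op in fuse_matmul_add(prog)])  # ['fused_matmul_add']

In an MLIR pipeline the same rewrite would typically be expressed as a RewritePattern that matches the add on a matmul result and replaces both with a fused op from the target dialect.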

Skills

Key technologies and capabilities for this role

MLIR, LLVM, Compiler Architecture, AI Inference, Transformer Models, NLP, Heterogeneous Compute, Dynamic Partitioning, Workload Scheduling, Cloud Deployment, Tensor Processors, Vector Engines

Questions & Answers

Common questions about this position

What is the work arrangement or location policy for this role?

The position is hybrid, requiring onsite work at the Toronto, Ontario, Canada office 3-5 days per week.

What is the salary for this position?

This information is not specified in the job description.

What key skills and expertise are required for this role?

The role requires expertise in MLIR-based compiler architecture for cloud inference, development of compiler passes for model partitioning, operator fusion, tensor layout optimization, memory tiling, and latency-aware scheduling, plus experience with large-scale AI models like LLMs and Transformers.

What is the company culture like at d-Matrix?

The culture emphasizes respect and collaboration, valuing humility, direct communication, inclusivity, and diverse perspectives for better solutions.

What makes a strong candidate for this ML Compiler Architect role?

Strong candidates are passionate about tackling challenges, driven by execution, with hands-on expertise in MLIR/LLVM compilers for AI inference, and leadership skills to mentor compiler engineers.

d-Matrix

AI compute platform for datacenters

About d-Matrix

d-Matrix focuses on improving the efficiency of AI computing for large datacenter customers. Its main product is the digital in-memory compute (DIMC) engine, which integrates compute directly within programmable memory. This design reduces power consumption and improves data processing speed while preserving accuracy. d-Matrix differentiates itself from competitors with a modular, scalable approach built on low-power chiplets that can be tailored to different applications. The company's goal is to provide high-performance, energy-efficient AI inference solutions to large-scale datacenter operators.

Headquarters: Santa Clara, California
Year Founded: 2019
Total Funding: $149.8M
Company Stage: Series B
Industries: Enterprise Software, AI & Machine Learning
Employees: 201-500

Benefits

Hybrid Work Options

Risks

Competition from Nvidia, AMD, and Intel may pressure d-Matrix's market share.
Complex AI chip design could lead to delays or increased production costs.
Rapid AI innovation may render d-Matrix's technology obsolete if not updated.

Differentiation

d-Matrix's DIMC engine integrates compute into memory, enhancing efficiency and accuracy.
The company offers scalable AI solutions through modular, low-power chiplets.
d-Matrix focuses on brain-inspired AI compute engines for diverse inferencing workloads.

Upsides

Growing demand for energy-efficient AI solutions boosts d-Matrix's low-power chiplets appeal.
Partnerships with companies like Microsoft could lead to strategic alliances.
Increasing adoption of modular AI hardware in data centers benefits d-Matrix's offerings.
