AI Infrastructure Solution Architect, Principal at d-Matrix

Santa Clara, California, United States

Compensation: $175,000 – $260,000
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Artificial Intelligence, Technology, Hardware

Requirements

  • Bachelor's or Master's degree in Computer Science or a related technical field
  • 10+ years of experience in infrastructure solution architecture, systems management, DevOps, or platform engineering roles
  • Experience working with GPUs, custom AI accelerators, or heterogeneous compute environments
  • Proven expertise in building, managing, and monitoring full-stack AI infrastructure at scale
  • Strong scripting/automation skills: Python, Bash, Ansible, Terraform, Helm, Docker/Kubernetes
  • Deep understanding of orchestration technologies (e.g., Kubernetes, Ray, KServe), containerization, server clusters, and multi-tenant serving
  • Experience with observability stacks (Prometheus, Grafana, OpenTelemetry, etc.)
  • Familiarity with model serving and orchestration platforms (e.g., Triton Inference Server, Ray Serve, Kubeflow)
  • Strong system debugging and incident response skills
  • Outstanding collaboration and communication skills

Responsibilities

  • Develop end-to-end AI infrastructure reference solutions optimized for d-Matrix servers, including compute, networking, storage, and orchestration layers, in collaboration with various internal teams
  • Create reference blueprints that integrate smoothly into cloud-native and on-prem environments
  • Develop infrastructure-as-code templates and examples using Ansible, Terraform, and Helm for provisioning d-Matrix-based nodes and clusters
  • Integrate with Kubernetes-based systems to enable model deployment, auto-scaling, and fault-tolerant execution
  • Design and deploy telemetry and monitoring frameworks to support real-time visibility into d-Matrix cluster health, job status, and system performance
  • Integrate with industry-standard observability stacks (e.g., Prometheus, Grafana, OpenTelemetry) for data collection, visualization, and alerting
  • Develop dashboards, health check systems, and metric pipelines that track performance, availability, and operational KPIs
  • Collaborate with performance and software teams to validate infrastructure using real-world workloads and benchmarks
  • Incorporate telemetry hooks for benchmark reporting and feedback-driven tuning
  • Create and publish detailed infrastructure deployment guides, monitoring configuration templates, and operational best practices
  • Collaborate with customers and the OEM/ISV ecosystem, enabling them to adopt and customize reference solutions for their specific datacenter environments and/or software stacks

Skills

AI Infrastructure
Solution Architecture
Kubernetes
Terraform
Ansible
Helm
Infrastructure as Code
Telemetry
Monitoring
Gen AI Inference
Networking
Storage
Orchestration
Auto-scaling

d-Matrix

AI compute platform for datacenters

About d-Matrix

d-Matrix focuses on improving the efficiency of AI computing for large datacenter customers. Its main product is the digital in-memory compute (DIMC) engine, which integrates compute capabilities directly into programmable memory. This design reduces power consumption and increases data processing speed while preserving accuracy. d-Matrix differentiates itself from competitors with a modular, scalable approach built on low-power chiplets that can be tailored for different applications. The company's goal is to provide high-performance, energy-efficient AI inference solutions to large-scale datacenter operators.

Headquarters: Santa Clara, California
Year Founded: 2019
Total Funding: $149.8M
Company Stage: Series B
Industries: Enterprise Software, AI & Machine Learning
Employees: 201-500

Benefits

Hybrid Work Options

Risks

Competition from Nvidia, AMD, and Intel may pressure d-Matrix's market share.
Complex AI chip design could lead to delays or increased production costs.
Rapid AI innovation may render d-Matrix's technology obsolete if not updated.

Differentiation

d-Matrix's DIMC engine integrates compute into memory, enhancing efficiency and accuracy.
The company offers scalable AI solutions through modular, low-power chiplets.
d-Matrix focuses on brain-inspired AI compute engines for diverse inferencing workloads.

Upsides

Growing demand for energy-efficient AI solutions boosts d-Matrix's low-power chiplets appeal.
Partnerships with companies like Microsoft could lead to strategic alliances.
Increasing adoption of modular AI hardware in data centers benefits d-Matrix's offerings.
