Crusoe

Senior Site Reliability Engineer, Compute

San Francisco, California, United States

Compensation: Not Specified
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Cloud Computing, Computer Systems, Artificial Intelligence

Requirements

Candidates should possess 5+ years of professional experience in Site Reliability Engineering, Linux system engineering, or compute infrastructure roles, demonstrating strong proficiency in Linux kernel internals, including scheduler, memory allocation, and driver subsystems. Experience with virtualization architectures and technologies such as KVM, Xen, QEMU, or VMware is required, along with familiarity with SmartNICs/DPUs (e.g., NVIDIA CX6/7, BlueField-3) and kernel bypass techniques. Expert-level skills in at least one programming language, such as C, Go, or Rust, are also necessary.

Responsibilities

As a Senior Site Reliability Engineer, you will develop automation and observability tools to monitor Crusoe’s compute infrastructure, from the kernel to the orchestration layers, and support and scale the company’s virtualization stack, including KVM, QEMU, and other hypervisors. You will collaborate with Linux kernel and hardware teams to identify and resolve performance bottlenecks and driver issues and to optimize hardware offloads, with a focus on performance for AI and HPC workloads across CPU, GPU, and DPU/NIC resources. You will also participate in root cause analysis for kernel crashes, hardware-software integration problems, and performance regressions, and integrate hypervisor-level enhancements to improve guest VM reliability and workload isolation. Finally, you will tune kernel subsystems such as the process scheduler, NUMA configuration, memory management, and interrupt handling, working closely with platform teams to implement and validate support for emerging compute hardware.

Skills

Linux kernel
Linux system engineering
KVM
Xen
QEMU
VMware
SmartNICs
DPUs
NVIDIA CX6
NVIDIA CX7
BlueField-3
kernel bypass
C
Go
Rust
Scheduler
Memory Allocation
Driver Subsystems
Hypervisors
Process Scheduler
NUMA Configuration
Memory Management
Interrupt Handling

Crusoe

Utilizes wasted energy for computing power

About Crusoe

Crusoe Energy Systems Inc. provides digital infrastructure that uses wasted, stranded, or clean energy sources to power high-performance computing and artificial intelligence. The company serves clients in the technology and energy sectors with scalable computing solutions that aim to reduce greenhouse gas emissions and support the transition to cleaner energy. Crusoe converts excess natural gas and renewable energy into computing power, which allows it to maximize resource efficiency while minimizing environmental impact. Unlike many competitors, Crusoe specifically targets the intersection of energy and technology. It generates revenue by supplying computing resources, along with technical support, to enterprises that need significant computational power for applications like AI and machine learning.

Key Metrics

Headquarters: Denver, Colorado
Year Founded: 2018
Total Funding: $1,082.2M
Company Stage: Series D
Industries: Energy, AI & Machine Learning
Employees: 201-500

Benefits

Industry-competitive pay
Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
Paid life insurance, short-term and long-term disability
Parental leave
Stock options in a fast-growing, well-funded technology company
Pet-friendly offices
Teladoc
401(k) with a 4% match
Unlimited time off
Cell phone reimbursement
Tuition reimbursement
Company-paid commuter benefit of $100 per month
Calm

Risks

Increased competition in AI infrastructure could threaten Crusoe's market share.
Regulatory scrutiny may arise from Bitcoin mining's environmental concerns.
Rapid expansion into AI infrastructure may lead to operational challenges.

Differentiation

Crusoe converts wasted energy into computing power, reducing environmental impact.
The company offers scalable solutions for AI and high-performance computing needs.
Crusoe's Digital Flare Mitigation technology utilizes natural gas for eco-friendly Bitcoin mining.

Upsides

Crusoe secured $600M in Series D funding, boosting AI infrastructure expansion.
Partnerships with tech firms enhance Crusoe's AI capabilities and market reach.
AI-driven energy optimization can significantly reduce operational costs in data centers.
