Director of Production Engineering at CoreWeave

Livingston, New Jersey, United States

CoreWeave Logo
Not SpecifiedCompensation
Expert & Leadership (9+ years)Experience Level
Full TimeJob Type
UnknownVisa
AI, Cloud Computing, TechnologyIndustries

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or related fields

Responsibilities

  • Define and execute the SRE vision, strategy, and roadmap for a large-scale, distributed cloud infrastructure
  • Lead and mentor a high-performing team of SREs, promoting a culture of ownership, collaboration, and continuous learning
  • Champion automation-first practices, leveraging tools like Terraform, Kubernetes, and Infrastructure-as-Code to minimize toil and manual interventions
  • Establish and evolve best practices in observability, monitoring, and alerting, ensuring the platform is proactive, not reactive
  • Drive initiatives for incident management, postmortem culture, root cause analysis, and system hardening
  • Collaborate with engineering, product, and customer support teams to build scalable, resilient, and self-healing systems
  • Evolve our on-call strategy and processes to support a 24x7, globally distributed platform with minimal disruptions

Skills

Key technologies and capabilities for this role

SRECloud InfrastructureDistributed SystemsAutomationScalabilityReliability EngineeringOperational ExcellenceLeadershipTeam Mentoring

Questions & Answers

Common questions about this position

What is the salary for the Director of Production Engineering role?

This information is not specified in the job description.

Is this position remote or hybrid?

The position is hybrid.

What technical skills are required for this role?

Key skills include experience with Terraform, Kubernetes, and Infrastructure-as-Code, along with expertise in observability, monitoring, alerting, incident management, and automation practices.

What is the company culture like at CoreWeave?

CoreWeave thrives in a dynamic environment that values adaptability, resilience, operational excellence, automation, ownership, collaboration, and continuous learning.

What makes a strong candidate for this Director role?

A strong candidate is a thoughtful leader with technical depth, strategic vision, and the ability to thrive in fast-moving, high-growth environments, valuing clarity over complexity, mentorship over management, and resilience over rigidity.

CoreWeave

Cloud service for GPU-accelerated workloads

About CoreWeave

CoreWeave provides cloud computing services that focus on GPU-accelerated workloads, which are essential for tasks requiring high computational power. Their services cater to industries such as artificial intelligence, machine learning, visual effects rendering, and data processing. Clients can access powerful computing resources on a pay-as-you-go basis, allowing them to avoid the costs of purchasing expensive hardware. CoreWeave's infrastructure utilizes a bare metal serverless Kubernetes platform, which enhances performance while minimizing operational complexity for users. This setup enables clients to optimize their computing needs with a variety of NVIDIA GPUs, ensuring they can balance performance and cost effectively. The company's goal is to offer flexible and scalable computing solutions that meet the demands of diverse clients, from tech companies to film studios.

New York City, New YorkHeadquarters
2017Year Founded
$1,625.4MTotal Funding
SECONDARYCompany Stage
Enterprise Software, AI & Machine LearningIndustries
501-1,000Employees

Benefits

Health Insurance
Dental Insurance
Vision Insurance
Life Insurance
Disability Insurance
Health Savings Account/Flexible Spending Account
Tuition Reimbursement
Mental Health Support
Family Planning Benefits
Paid Parental Leave
Hybrid Work Options
401(k) Company Match
Unlimited Paid Time Off
Catered lunch each day in our office and data center locations
A casual work environment

Risks

Increased competition from Vultr, backed by AMD, may impact CoreWeave's market share.
The planned IPO in 2025 could expose CoreWeave to market volatility and scrutiny.
Reliance on NVIDIA GPUs poses risks if supply chain issues arise.

Differentiation

CoreWeave specializes in GPU-accelerated workloads, catering to AI and high-performance computing.
Their infrastructure uses a bare metal serverless Kubernetes platform for high performance.
CoreWeave offers a flexible pay-as-you-go pricing model, appealing to cost-conscious clients.

Upsides

CoreWeave's partnership with Dell enhances infrastructure, attracting clients seeking advanced AI solutions.
The $600M data center funding in Virginia supports CoreWeave's expansion in the AI cloud market.
Experienced leadership, like CIO Sandy Venugopal, positions CoreWeave for strategic growth.

Land your dream remote job 3x faster with AI