Technical Customer Success Engineer at Baseten

San Francisco, California, United States


Compensation: $150,000 – $225,000

Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)

Job Type: Full Time

Visa: Unknown

Industries: AI, Machine Learning, Technology

Requirements

Deep Kubernetes troubleshooting expertise, including advanced resource debugging, pod/runtime analysis, and log-based diagnostics using observability tooling such as Grafana, Loki, and Prometheus
Strong infrastructure debugging ability across container orchestration, networking, and service dependencies, with hands-on experience supporting production-grade clusters
Experience managing high-severity incidents with major customers, including SLAs, post-incident reviews, and clear communication throughout escalations
Proven project management and organizational skills with an ownership mindset, able to manage multiple complex, multi-stakeholder initiatives in parallel, including issue resolution, root-cause analysis, and feature delivery
Ability to translate recurring technical pain points into roadmap-level insights, documentation improvements, or product enhancements
Strong communication skills and executive presence during high-visibility situations, ensuring technical clarity and customer confidence
3+ years of experience in a fast-paced, high-growth, or customer-facing engineering environment

Responsibilities

Diagnose and resolve runtime issues related to latency, memory behavior, GPU utilization, concurrency, and model lifecycle management
Debug infrastructure issues across Kubernetes (pods, controllers), networking, observability, and alerting systems
Lead incident response during outages or escalations, managing coordination between Product, FDE, Sales, and Engineering
Serve as the technical owner for top enterprise accounts with strict SLAs and high responsiveness expectations
Identify common failure modes and translate user feedback into roadmap signals, product improvements, internal runbooks, knowledge bases, and diagnostic best practices
Own project coordination end-to-end, including scoping, execution, communication, and stakeholder alignment across technical and non-technical teams, for work ranging from feature requests and new deployments to operational debugging issues

Skills

Key technologies and capabilities for this role

Kubernetes, GPU utilization, latency debugging, memory management, concurrency, model lifecycle management, networking, observability, alerting, incident response, ML workloads, AI model performance

Questions & Answers

Common questions about this position

What is the salary range for the Technical Customer Success Engineer position?

The salary range is $150K - $225K.

Is this a remote position or does it require office work?

This information is not specified in the job description.

What key skills are required for this role?

The role requires deep Kubernetes troubleshooting expertise including pod/runtime analysis and log-based diagnostics with Grafana, Loki, and Prometheus; strong infrastructure debugging across container orchestration, networking, and service dependencies; and experience managing high-severity incidents with SLAs and post-incident reviews.

What is the team structure like for this role?

You will partner closely with product, engineering, and forward-deployed teams, while coordinating across Product, FDE, Sales, and Engineering during incidents and projects.

What makes a strong candidate for this position?

A strong candidate has proven project management and organizational skills with an ownership mindset, able to manage multiple complex, multi-stakeholder initiatives including issue resolution and root-cause analysis.

Baseten

Platform for deploying and managing ML models

About Baseten

Baseten provides a platform for deploying and managing machine learning (ML) models, aimed at simplifying the process for businesses. Users can select from a library of open-source foundation models and deploy them with just two clicks, making it easier to implement ML solutions. The platform features autoscaling, which adjusts resources based on demand, and comprehensive monitoring tools for tracking performance and troubleshooting. A key differentiator is Baseten's open-source model packaging framework, Truss, which allows users to package and deploy custom models easily. The company operates on a usage-based pricing model, where clients pay only for the time their models are actively deployed, helping them manage costs effectively.

Headquarters: San Francisco, California

Year Founded: 2019

Total Funding: $58.4M

Company Stage: Series B

Industries: AI & Machine Learning

Employees: 51-200

Benefits

💰 Competitive compensation: We aim to provide 90th percentile (or better) salaries and equity grants for every team member commensurate with their experience.

🌎 Remote-first work environment: The Baseten team is welcome to work from wherever they want: fully remote, in our San Francisco office, or a mix of both. We provide a $1,000 stipend for you to make your home office comfortable and productive.

🏓 Regular in-person team summits: We get together as a team three times a year to plan, workshop, and most importantly, get to know each other better.

🌴 Unlimited PTO: We ask that everyone take at least 4 weeks of vacation. And we have a company-wide break between Christmas and New Year's Day.

🏥 Full healthcare coverage: Medical, dental and vision insurance for you and your family.

🍼 Paid parental leave: 16 weeks of fully paid parental leave (adoptive and non-birth parents included) and flexibility with schedules while returning to work.

📈 401(k): Company-sponsored 401(k) for you to contribute to.

🧠 Learning and development budget: We encourage you to take classes, attend conferences, and invest in your craft, and we’ll cover expenses to make it happen.

Risks

Increased competition from specialized AI models tailored for specific industries.

Potential over-reliance on Google Cloud Marketplace may limit flexibility and control.

Rapid AI model development could render Baseten's offerings obsolete without continuous innovation.

Differentiation

Baseten offers a serverless backend for machine-learning applications with auto-scaling.

Truss, an open-source model packaging framework, allows seamless deployment of custom models.

Baseten's platform provides comprehensive monitoring tools for efficient model performance tracking.

Upsides

Integration with Google Cloud Marketplace boosts visibility and customer acquisition potential.

$40M Series B funding enhances Baseten's platform capabilities and market reach.

Chains framework positions Baseten for complex AI workflows, attracting sophisticated projects.
