TLM, AI Evaluation Science at Diligent Robotics

Wake Forest, North Carolina, United States

Compensation: Not specified
Experience Level: Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Robotics, Artificial Intelligence

Requirements

  • MS or PhD in Computer Science, Robotics, ML, EE, or related field along with 8+ years of AI/ML experience
  • Proven leadership experience: built and managed technical teams in AI, simulation, or robotics evaluation
  • Hands-on expertise building and evaluating large multimodal ML models (vision, language, action)
  • Strong background in defining and operationalizing metrics for AI/robotics systems (safety, robustness, reliability)
  • Demonstrated success in designing end-to-end evaluation pipelines: from data labeling and test definition to automated reporting and regression tracking
  • Experience in evaluation, benchmarking, or safety in robotics, AVs, or similar domains
  • Experience with simulation platforms for robotics or AVs
  • Technical depth in ML interpretability, error analysis, and data-driven model improvement
  • Ability to operate in a startup context: strategic, but hands-on in code and experimentation
  • Excellent communication and cross-functional alignment skills—able to articulate risks, metrics, and trade-offs to executives, engineers, and non-technical stakeholders

Responsibilities

  • Lead the AI Evaluation Science team, owning evaluation strategy for robot perception, planning, control, and multimodal models
  • Define metrics and benchmarks for AI performance across safety, reliability, user experience, and robustness
  • Develop and maintain large-scale simulation environments to test robot behaviors under diverse real-world conditions (edge cases, adversarial scenarios, rare failures)
  • Design evaluation frameworks that cover offline experiments, simulation, and live deployments
  • Build scalable pipelines for test coverage, automated evaluation, and regression tracking
  • Oversee labeling and data curation pipelines to generate high-quality ground truth for training and validation
  • Drive interpretability and explainability in embodied AI models—ensuring failures are measurable, diagnosable, and improvable
  • Collaborate closely with AI/Robotics engineering teams to define product requirements, set acceptance thresholds, and close the loop between evaluation and development
  • Actively mentor engineers and scientists while contributing hands-on to code, experiments, and metrics design

Skills

Key technologies and capabilities for this role

AI Evaluation, Robotics, Simulation Environments, Metrics, Benchmarks, Perception, Planning, Control, Multimodal Models, Data Labeling, Data Curation, Interpretability, Explainability, Evaluation Pipelines, Regression Tracking

Questions & Answers

Common questions about this position

What education and experience are required for the TLM, AI Evaluation Science role?

Candidates need an MS or PhD in Computer Science, Robotics, ML, EE, or related field, along with 8+ years of AI/ML experience, proven leadership in building technical teams, and hands-on expertise in multimodal ML models and evaluation pipelines.

Is this a remote position or does it require on-site work?

This information is not specified in the job description.

What is the salary or compensation for this role?

This information is not specified in the job description.

What does the company culture look like at Diligent Robotics?

Diligent Robotics is a mission-driven, venture-backed startup focused on building humanoid robots that collaborate with humans. The role itself is a hands-on leadership position that combines strategy with player-coach work.

What makes a strong candidate for this TLM, AI Evaluation Science position?

A strong candidate has proven leadership in AI or robotics teams, expertise in defining metrics for safety and reliability, and experience building end-to-end evaluation pipelines including simulation and data labeling.

Diligent Robotics

Develops robots to assist healthcare staff

About Diligent Robotics

Diligent Robotics develops robots like Moxi to assist hospital staff with routine tasks, allowing healthcare professionals to focus more on patient care. Moxi can perform activities such as delivering lab samples and supplies within hospitals, which can save staff up to 30% of their workday. This helps improve operational efficiency and enhances the patient experience. The company uses an A.I. framework that includes social intelligence and human-guided learning to enable Moxi to navigate hospital environments effectively. Unlike competitors, Diligent Robotics focuses specifically on healthcare, providing tailored robotic solutions that integrate seamlessly into hospital workflows. The goal is to transform healthcare by improving staff productivity and patient care through the use of robotics.

Headquarters: Austin, Texas
Year Founded: 2017
Total Funding: $68.9M
Company Stage: Late VC
Industries: AI & Machine Learning, Healthcare
Employees: 51-200

Benefits

Competitive salary & equity
Competitive health plan coverage

Risks

Increased competition from well-funded robotics companies in the USA, China, and Israel.
Maintaining a competitive edge in AI innovation is a constant challenge.
Gender disparity in robotics may impact talent acquisition and retention.

Differentiation

Diligent Robotics' Moxi robot specializes in hospital logistics, enhancing staff efficiency.
Moxi's social intelligence and human-guided learning set it apart in healthcare robotics.
Diligent Robotics focuses on operational efficiency, aligning with value-based care models.

Upsides

Diligent Robotics raised $25 million to expand Moxi's reach in hospitals.
Moxi completed 110,000 autonomous elevator rides, showcasing advanced mobility capabilities.
Growing demand for robotic solutions in healthcare boosts Diligent Robotics' market potential.
