Product Manager, Evaluation & Data Generation at Hippocratic AI

Palo Alto, California, United States

Compensation: Not Specified

Experience Level: Senior (5 to 8 years)

Job Type: Full Time

Visa: Unknown

Industries: Healthcare, Artificial Intelligence

Requirements

3+ years in product management with experience in ML evaluation, labeling, or data pipelines
Familiarity with language model datasets, especially in high-stakes or regulated settings
Experience collaborating with labeling vendors, data QA teams, or managing Mechanical Turk-style pipelines
Attention to detail in process design and tooling for human-in-the-loop systems

Responsibilities

Define the strategy and architecture for model evaluation across agent behaviors
Collaborate with data scientists, ML engineers, and clinicians to craft robust benchmarks
Design and manage internal and external workflows for data labeling and generation
Monitor data quality and iterate on tooling and process efficiency
Work closely with the model training team to align data feedback loops with product performance

Skills

Key technologies and capabilities for this role

Product Management, LLM, Model Evaluation, Data Generation, AI Safety, Healthcare AI, Agent Behaviors, Data Science, Strategy, Architecture

Questions & Answers

Common questions about this position

Is this role remote or onsite?

This is an onsite role: the team works from the office in Palo Alto, CA, five days a week.

What experience is required for this Product Manager role?

Candidates need 3+ years in product management with experience in ML evaluation, labeling, or data pipelines, plus familiarity with language model datasets in high-stakes settings and experience with labeling vendors or data QA teams.

What is the salary or compensation for this position?

This information is not specified in the job description.

What is the company culture like at Hippocratic AI?

The company values in-person teamwork, on the belief that the best ideas happen when people work together. It describes its team as world-class experts in healthcare and AI with visionary leadership.

What makes a strong candidate for this role?

A strong candidate has 3+ years in product management focused on ML evaluation or data pipelines, experience with language model datasets in regulated environments, and skills in collaborating with labeling vendors and designing processes for human-in-the-loop systems.