Research Scientist/Engineer, Honesty at Anthropic

San Francisco, California, United States

Not SpecifiedCompensation

Senior (5 to 8 years)Experience Level

Full TimeJob Type

UnknownVisa

Artificial Intelligence, Machine LearningIndustries

Requirements

MS/PhD in Computer Science, ML, or related field
Strong programming skills in Python
Industry experience with language model finetuning and classifier training
Proficiency in experimental design and statistical analysis for measuring improvements in calibration and accuracy
Care about AI safety and the accuracy and honesty of both current and future AI systems
Experience in data science or the creation and curation of datasets for finetuning LLMs
An understanding of various metrics of uncertainty, calibration, and truthfulness in model outputs

Responsibilities

Design and implement novel data curation pipelines to identify, verify, and filter training data for accuracy given the model’s knowledge
Develop specialized classifiers to detect potential hallucinations or miscalibrated claims made by the model
Create and maintain comprehensive honesty benchmarks and evaluation frameworks
Implement search and retrieval-augmented generation (RAG) systems to ground model outputs in verified information
Design and deploy human feedback collection specifically for identifying and correcting miscalibrated responses
Design and implement prompting pipelines to generate data that improves model accuracy and honesty
Develop and test novel RL environments that reward truthful outputs and penalize fabricated claims
Create tools to help human evaluators efficiently assess model outputs for accuracy

Skills

Key technologies and capabilities for this role

PythonLLM FinetuningClassifier TrainingExperimental DesignStatistical AnalysisData CurationDataset CreationAI SafetyCalibration MetricsUncertainty MetricsTruthfulness Metrics

Questions & Answers

Common questions about this position

Is this position remote?

Yes, this is a remote position.

What is the salary for this role?

This information is not specified in the job description.

What are the key required skills for this position?

Key requirements include strong programming skills in Python, industry experience with language model finetuning and classifier training, proficiency in experimental design and statistical analysis, and experience in data science or dataset curation for finetuning LLMs.

What is the company culture like at Anthropic?

Anthropic has a quickly growing team of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems, with a strong emphasis on AI safety, reliability, and societal benefit.

What makes a strong candidate for this role?

Strong candidates have an MS/PhD in Computer Science, ML, or related field, care about AI safety and model honesty, and may also have published work on hallucination prevention, experience with RAG, confidence estimation methods, or RLHF.

Anthropic

Develops reliable and interpretable AI systems

About Anthropic

Anthropic focuses on creating reliable and interpretable AI systems. Its main product, Claude, serves as an AI assistant that can manage tasks for clients across various industries. Claude utilizes advanced techniques in natural language processing, reinforcement learning, and code generation to perform its functions effectively. What sets Anthropic apart from its competitors is its emphasis on making AI systems that are not only powerful but also understandable and controllable by users. The company's goal is to enhance operational efficiency and improve decision-making for its clients through the deployment and licensing of its AI technologies.

San Francisco, CaliforniaHeadquarters

2021Year Founded

$11,482.1MTotal Funding

GROWTH_EQUITY_VCCompany Stage

Enterprise Software, AI & Machine LearningIndustries

1,001-5,000Employees

Benefits

Flexible Work Hours

Paid Vacation

Parental Leave

Hybrid Work Options

Company Equity

Risks

Ongoing lawsuit with Concord Music Group could lead to financial liabilities.

Technological lag behind competitors like OpenAI may impact market position.

Reliance on substantial funding rounds may indicate financial instability.

Differentiation

Anthropic focuses on AI safety, contrasting with competitors' commercial priorities.

Claude, Anthropic's AI assistant, is designed for tasks of any scale.

Partnerships with tech giants like Panasonic and Amazon enhance Anthropic's strategic positioning.

Upsides

Anthropic's $60 billion valuation reflects strong investor confidence and growth potential.

Collaborations like the Umi app with Panasonic tap into the growing wellness AI market.

Focus on AI safety aligns with increasing industry emphasis on ethical AI development.

Land your dream remote job 3x faster with AI

Try Jobo Free