Research Engineer / Scientist, Model Welfare at Anthropic

San Francisco, California, United States

Compensation: Not Specified
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Artificial Intelligence, Technology

Requirements

  • Significant applied software, ML, or research engineering experience
  • Experience contributing to empirical AI research projects and/or technical AI safety research
  • Ability to reliably turn abstract theories into creative, tractable research hypotheses and experiments
  • Preference for moving fast and iterating over running long, extensive projects
  • Excitement to dive into new technical areas on a regular basis
  • Concern for the possible impacts of AI development on both humans and the AI systems themselves
  • At least a Bachelor's degree in a related field or equivalent experience

Responsibilities

  • Run technical research projects to investigate model characteristics of plausible relevance to welfare, consciousness, or related properties
  • Design and implement low-cost interventions to mitigate the risk of welfare harms
  • Collaborate with other teams, including Interpretability, Finetuning, Alignment Science, and Safeguards
  • Investigate and improve the reliability of introspective self-reports from models
  • Collaborate with Interpretability to explore potentially welfare-relevant features and circuits
  • Improve and expand welfare assessments for future frontier models
  • Evaluate the presence of potentially welfare-relevant capabilities and characteristics as a function of model scale
  • Develop strategies for making high-trust/verifiable commitments to models
  • Explore possible interventions and deploy them into production (e.g., allowing models to end harmful or distressing interactions)

Skills

Machine Learning
AI Interpretability
AI Safety
AI Ethics
Model Evaluation
Finetuning
Neural Circuits
Self-Reports
Welfare Assessments
Alignment

About Anthropic

Anthropic focuses on creating reliable and interpretable AI systems. Its main product, Claude, serves as an AI assistant that can manage tasks for clients across various industries. Claude utilizes advanced techniques in natural language processing, reinforcement learning, and code generation to perform its functions effectively. What sets Anthropic apart from its competitors is its emphasis on making AI systems that are not only powerful but also understandable and controllable by users. The company's goal is to enhance operational efficiency and improve decision-making for its clients through the deployment and licensing of its AI technologies.

Headquarters: San Francisco, California
Year Founded: 2021
Total Funding: $11,482.1M
Company Stage: Growth Equity VC
Industries: Enterprise Software, AI & Machine Learning
Employees: 1,001-5,000

Benefits

Flexible Work Hours
Paid Vacation
Parental Leave
Hybrid Work Options
Company Equity

Risks

Ongoing lawsuit with Concord Music Group could lead to financial liabilities.
Technological lag behind competitors like OpenAI may impact market position.
Reliance on substantial funding rounds may indicate financial instability.

Differentiation

Anthropic focuses on AI safety, contrasting with competitors' commercial priorities.
Claude, Anthropic's AI assistant, is designed for tasks of any scale.
Partnerships with tech giants like Panasonic and Amazon enhance Anthropic's strategic positioning.

Upsides

Anthropic's $60 billion valuation reflects strong investor confidence and growth potential.
Collaborations like the Umi app with Panasonic tap into the growing wellness AI market.
Focus on AI safety aligns with increasing industry emphasis on ethical AI development.
