Alignment Data Scientist
AE Studio - Full Time - Junior (1 to 2 years)
Candidates should have significant software, ML, or research engineering experience, along with some prior contributions to empirical AI research projects. Familiarity with machine learning techniques and research methodologies is expected, and a strong background in computer science or a related field is preferred.
As a Research Engineer, you will build and run elegant and thorough machine learning experiments to understand and steer the behavior of powerful AI systems, focusing on AI safety and risks from future systems. You will contribute to exploratory experimental research, often in collaboration with other teams, and develop techniques for scalable oversight, AI control, and alignment stress-testing. Typical projects include:
- Testing the robustness of safety techniques
- Running multi-agent reinforcement learning experiments
- Building tooling to evaluate jailbreaks
- Writing scripts and prompts for evaluation questions
- Contributing to research papers, blog posts, and talks
You will also run experiments that feed into key AI safety efforts, such as the design and implementation of Anthropic's Responsible Scaling Policy.
Develops reliable and interpretable AI systems
Anthropic focuses on creating reliable and interpretable AI systems. Its main product, Claude, serves as an AI assistant that can manage tasks for clients across various industries. Claude utilizes advanced techniques in natural language processing, reinforcement learning, and code generation to perform its functions effectively. What sets Anthropic apart from its competitors is its emphasis on making AI systems that are not only powerful but also understandable and controllable by users. The company's goal is to enhance operational efficiency and improve decision-making for its clients through the deployment and licensing of its AI technologies.