Alignment Data Scientist
AE Studio | Full Time
Junior (1 to 2 years)
Candidates should have strong statistical skills, including experience evaluating scientific experiments on data collection and model performance; extremely strong software engineering skills; and expertise in designing and running data collection tasks, including work with human annotators. They should have experience analyzing datasets for quality, bias, and suitability for training ML models; hands-on experience training large language models (LLMs) on distributed training infrastructure; and familiarity with evaluating and improving the generalizability and robustness of ML systems. Proficiency in programming languages such as Python and in ML frameworks (e.g., PyTorch, TensorFlow, JAX) is required, along with a demonstrated publication record in top-tier machine learning venues such as NeurIPS, ICML, ICLR, AISTATS, MLSys, JMLR, AAAI, Nature, COLING, ACL, or EMNLP.
As a Member of Technical Staff, you will focus on data generation, post-training algorithms, and evaluation methods to ensure the safety of the next generation of models that can access external resources and take actions in the world. You will work closely with cross-functional machine learning teams and data annotation teams, and collaborate with product and policy teams. You will be responsible for making the next generation of LLMs better for society as a whole: tackling new scientific problems, implementing solutions, and diving into messy data and results.
Provides NLP tools and LLMs via API
Cohere provides advanced Natural Language Processing (NLP) tools and Large Language Models (LLMs) through a user-friendly API. Its services cater to a wide range of clients, including businesses that want to improve their content generation, summarization, and search functions. Cohere's business model focuses on offering scalable and affordable generative AI tools, generating revenue by granting API access to pre-trained models that can handle tasks like text classification, sentiment analysis, and semantic search in multiple languages. The platform is customizable, enabling businesses to create smarter and faster solutions. With multilingual support, Cohere helps address language barriers, making it suitable for international use.