Data Scientist
Routable
Full Time
Junior (1 to 2 years)
Candidates should possess a Bachelor's degree in Computer Science, Data Science, or a related field, with a Master's degree preferred, or equivalent experience. Strong software engineering skills, particularly in Python, are required, along with a solid understanding of data structures, algorithms, and software design principles. Experience analyzing datasets for quality and suitability for training ML models is also necessary. Prior experience with multilingual data and a passion for natural language processing are a plus, and having one or more first-author papers at top-tier venues such as NeurIPS, ICML, ICLR, AISTATS, MLSys, JMLR, AAAI, Nature, COLING, ACL, or EMNLP is beneficial.
As a Member of Technical Staff, you will design and implement data pipelines to process and prepare multilingual datasets, collaborating with researchers to understand data requirements and model performance. You will develop tools and scripts to automate data-related tasks and improve efficiency, ensuring data quality and integrity through rigorous testing and validation. Staying updated on the latest advancements in multilingual data processing and contributing to the team's knowledge base are also key responsibilities.
Provides NLP tools and LLMs via API
Cohere provides advanced Natural Language Processing (NLP) tools and Large Language Models (LLMs) through a user-friendly API. Its services cater to a wide range of clients, including businesses that want to improve their content generation, summarization, and search functions. Cohere's business model focuses on offering scalable and affordable generative AI tools, generating revenue by granting API access to pre-trained models that can handle tasks like text classification, sentiment analysis, and semantic search in multiple languages. The platform is customizable, so businesses can adapt it to their own use cases. With multilingual support, Cohere addresses language barriers, making it suitable for international use.