Senior Research Scientist, Model Evaluation at Cohere

Toronto, Ontario, Canada

Compensation: Not Specified
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Artificial Intelligence, Machine Learning

Requirements

  • Enjoy rapidly building prototypes that demonstrate the boundaries of what LLMs are capable of, and have developed resources to measure those capabilities
  • Have spent dozens of hours reviewing complex data and LLM outputs to ensure high data quality
  • Are obsessive about rigorously measuring AI capabilities, and about making sure your measurements actually align with the capabilities you care about
  • Have strong software engineering skills

Responsibilities

  • Create ambitious new evaluation benchmarks that push the limits of what our models can accomplish
  • Work on highly cross-functional teams to translate model feedback into trustworthy, repeatable evaluations
  • Conduct research to advance the state-of-the-art in LLM evaluation methods, including training LLM judges; refining LLM-based data synthesis pipelines; and improving evaluation efficiency
  • Build scalable and reusable tools for digging into model performance

Skills

Key technologies and capabilities for this role

LLM Evaluation, Benchmark Creation, LLM Judges, Data Synthesis, Evaluation Infrastructure, Model Performance Analysis, Prototyping, Scalable Tools, Research Methods, Cross-functional Collaboration

Questions & Answers

Common questions about this position

What is the work arrangement for this role?

The position is hybrid.

What are the key responsibilities of a Senior Research Scientist, Model Evaluation?

Responsibilities include creating new evaluation benchmarks, working on cross-functional teams for trustworthy evaluations, conducting research on LLM evaluation methods like training LLM judges, and building scalable tools for model performance analysis.

What skills and experiences make someone a good fit for this role?

Ideal candidates enjoy building prototypes to test LLM boundaries, have extensive experience reviewing complex data and LLM outputs for quality, are obsessive about rigorous AI measurements that align with real capabilities, and possess strong software engineering skills.

What is the company culture like at Cohere?

Cohere has an open and inclusive culture, with a team of top researchers, engineers, and designers who obsess over their work, move fast for customers, and value diverse perspectives.

What perks do full-time employees receive at Cohere?

Full-time employees enjoy an open and inclusive culture and work environment, and work closely with a team on the cutting edge.

Cohere

Provides NLP tools and LLMs via API

About Cohere

Cohere provides advanced Natural Language Processing (NLP) tools and Large Language Models (LLMs) through a user-friendly API. Their services cater to a wide range of clients, including businesses that want to improve their content generation, summarization, and search functions. Cohere's business model focuses on offering scalable and affordable generative AI tools, generating revenue by granting API access to pre-trained models that can handle tasks like text classification, sentiment analysis, and semantic search in multiple languages. The platform is customizable, enabling businesses to create smarter and faster solutions. With multilingual support, Cohere effectively addresses language barriers, making it suitable for international use.

Headquarters: Toronto, Canada
Year Founded: 2019
Total Funding: $914.4M
Company Stage: Series D
Industries: AI & Machine Learning
Employees: 501-1,000

Risks

Competitors like Google and Microsoft may overshadow Cohere with seamless enterprise system integration.
Reliance on Nvidia chips poses risks if supply chain issues arise or strategic focus shifts.
The high cost of the AI data center could strain financial resources if government funding is delayed.

Differentiation

Cohere's North platform outperforms Microsoft Copilot and Google Vertex AI in enterprise functions.
Rerank 3.5 model processes queries in over 100 languages, enhancing multilingual search capabilities.
Command R7B model excels in RAG, math, and coding, outperforming competitors like Google's Gemma.

Upsides

Cohere's AI data center project positions it as a key player in Canadian AI.
North platform offers secure AI deployment for regulated industries, enhancing privacy-focused enterprise solutions.
Cohere's multilingual support breaks language barriers, expanding its global market reach.
