Cohere

Senior Tech Lead Manager, Model Efficiency

San Francisco, California, United States

Compensation: Not Specified

Experience Level: Senior (5 to 8 years)

Job Type: Full Time

Visa: Unknown

Industries: Artificial Intelligence, AI & Machine Learning, Deep Learning, Model Optimization

Requirements

Candidates should possess 3+ years of experience managing engineering teams with a demonstrable impact on system performance metrics and team growth, along with extensive experience in transformer architecture optimizations. They should also have a strong understanding of ML accelerator architectures (GPUs, TPUs, custom ASICs), memory hierarchies, and hardware-aware optimization techniques including tensor core utilization and parallel computation patterns.

Responsibilities

As a Senior Tech Lead Manager, Model Efficiency, you will architect comprehensive technical roadmaps with quantifiable performance metrics aligned with product requirements, and identify and address technical competency gaps through strategic hiring. You will also demonstrate expert-level understanding of ML accelerator architectures and inference frameworks, collaborate with MLOps and infrastructure teams, implement agile methodologies, provide technical mentorship, and integrate optimizations into production systems.

Skills

transformer architecture

ML accelerator architectures

GPUs

TPUs

ASICs

memory hierarchies

hardware-aware optimizations

inference frameworks

TensorRT

ONNX Runtime

PyTorch JIT

quantization strategies

operator fusion

system performance metrics

profiling techniques

hardware-software co-design

Cohere

Provides NLP tools and LLMs via API

About Cohere

Cohere provides advanced Natural Language Processing (NLP) tools and Large Language Models (LLMs) through a user-friendly API. Their services cater to a wide range of clients, including businesses that want to improve their content generation, summarization, and search functions. Cohere's business model focuses on offering scalable and affordable generative AI tools, generating revenue by granting API access to pre-trained models that can handle tasks like text classification, sentiment analysis, and semantic search in multiple languages. The platform is customizable, enabling businesses to create smarter and faster solutions. With multilingual support, Cohere effectively addresses language barriers, making it suitable for international use.

Key Metrics

Headquarters: Toronto, Canada

Year Founded: 2019

Total Funding: $914.4M

Company Stage: Series D

Industries: AI & Machine Learning

Employees: 501-1,000

Risks

Competitors like Google and Microsoft may overshadow Cohere with seamless enterprise system integration.

Reliance on Nvidia chips poses risks if supply chain issues arise or strategic focus shifts.

The high cost of its AI data center could strain financial resources if government funding is delayed.

Differentiation

Cohere's North platform outperforms Microsoft Copilot and Google Vertex AI in enterprise functions.

Rerank 3.5 model processes queries in over 100 languages, enhancing multilingual search capabilities.

Command R7B model excels in RAG, math, and coding, outperforming competitors like Google's Gemma.

Upsides

Cohere's AI data center project positions it as a key player in Canadian AI.

North platform offers secure AI deployment for regulated industries, enhancing privacy-focused enterprise solutions.

Cohere's multilingual support breaks language barriers, expanding its global market reach.
