Machine Learning Engineer
Full Time
Senior (5 to 8 years)
Common questions about this position
The position is hybrid, based out of offices in London, Toronto, San Francisco, and New York, with remote-friendly options.
Candidates need extremely strong software engineering skills, proficiency in Python and in ML frameworks and compilers such as JAX, PyTorch, and XLA/MLIR, experience writing CUDA and Triton kernels for GPUs, experience with large-scale distributed training strategies, and familiarity with autoregressive sequence models like Transformers.
Cohere emphasizes a culture of hard work, fast movement, and customer focus, with a diverse team of professionals passionate about scaling intelligence to serve humanity through training frontier models.
A strong candidate has extremely strong software engineering skills, GPU kernel experience with CUDA and Triton, proficiency in Python ML frameworks, distributed training expertise, and familiarity with Transformers; publications at top ML conferences like NeurIPS or ICML are a bonus.
Provides NLP tools and LLMs via API
Cohere provides advanced Natural Language Processing (NLP) tools and Large Language Models (LLMs) through a user-friendly API. Their services cater to a wide range of clients, including businesses that want to improve their content generation, summarization, and search functions. Cohere's business model focuses on offering scalable and affordable generative AI tools, generating revenue by granting API access to pre-trained models that can handle tasks like text classification, sentiment analysis, and semantic search in multiple languages. The platform is customizable, enabling businesses to create smarter and faster solutions. With multilingual support, Cohere effectively addresses language barriers, making it suitable for international use.