Software Engineer - Model Performance
Baseten - Full Time
- Mid-level (3 to 4 years), Senior (5 to 8 years)
Candidates should have experience serving ML models in production; designing, implementing, and maintaining a production service at scale; and a strong intuition for system behavior and resource estimation under different workloads. Familiarity with the inference characteristics of deep learning models, specifically Transformer-based architectures, and with the computational characteristics of accelerators (GPUs, TPUs, and/or Inferentia) is required, along with experience in performance benchmarking, profiling, and optimization. A strong understanding of, or working experience with, distributed systems and familiarity with cloud infrastructure (e.g., AWS, GCP) are also necessary, as is proficiency in Golang (or another language designed for high-performance, scalable servers).
As a Member of Technical Staff, Model Serving, you will be responsible for developing, deploying, and operating the AI platform that delivers Cohere’s large language models through easy-to-use API endpoints. You will work closely with various teams to serve optimized LLMs in production in low-latency, high-throughput, high-availability environments. You will also have the opportunity to interface with customers and create customized deployments to meet their specific needs.
Provides NLP tools and LLMs via API
Cohere provides advanced Natural Language Processing (NLP) tools and Large Language Models (LLMs) through a user-friendly API. Its services cater to a wide range of clients, including businesses that want to improve their content generation, summarization, and search functions. Cohere's business model focuses on offering scalable and affordable generative AI tools, generating revenue by granting API access to pre-trained models that can handle tasks such as text classification, sentiment analysis, and semantic search in multiple languages. The platform is customizable, enabling businesses to create smarter and faster solutions. With multilingual support, Cohere addresses language barriers, making it suitable for international use.