AI Inference Engineer (London) at Perplexity AI

London, England, United Kingdom

Compensation: Not specified

Experience Level: Mid-level (3 to 4 years), Senior (5 to 8 years)

Job Type: Full Time

Visa: Unknown

Industries: Technology, AI

Requirements

Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)
Familiarity with common LLM architectures and inference optimization techniques (e.g. continuous batching, quantization)
Understanding of GPU architectures or experience with GPU kernel programming using CUDA

Responsibilities

Develop APIs for AI inference that will be used by both internal and external customers
Benchmark and address bottlenecks throughout our inference stack
Improve the reliability and observability of our systems and respond to system outages
Explore novel research and implement LLM inference optimizations

Skills

Key technologies and capabilities for this role

Python, Rust, C++, PyTorch, Triton, CUDA, Kubernetes, ML systems, deep learning frameworks, TensorFlow, ONNX, LLM architectures, continuous batching, quantization, GPU kernel programming

Questions & Answers

Common questions about this position

What is the compensation for this AI Inference Engineer role?

Final offer amounts are determined by multiple factors, including experience and expertise. Equity may be part of the total compensation package in addition to the base salary.

Is this AI Inference Engineer position remote or based in London?

The listing gives the location as London, England, United Kingdom, but does not specify whether the role is remote, hybrid, or on-site.

What skills are required for the AI Inference Engineer role?

Required skills include experience with ML systems and deep learning frameworks such as PyTorch, TensorFlow, and ONNX; familiarity with LLM architectures and inference optimizations such as continuous batching and quantization; and an understanding of GPU architectures or experience with CUDA kernel programming.

What does the team and work environment look like at Perplexity AI?

This is a growing team working on large-scale deployment of machine learning models for real-time inference, using a stack including Python, Rust, C++, PyTorch, Triton, CUDA, and Kubernetes.

What makes a strong candidate for this AI Inference Engineer position?

Strong candidates will have experience with ML systems, deep learning frameworks like PyTorch, LLM inference optimizations, and GPU programming with CUDA, as these directly align with the responsibilities of developing APIs, benchmarking, and improving inference systems.

Perplexity AI

Advanced answer engine providing reliable information

About Perplexity AI

Perplexity AI provides an advanced answer engine that delivers accurate and reliable responses to user queries. The platform uses current sources to ensure the information is both precise and relevant. It caters to a wide audience, including individuals looking for quick answers and businesses needing detailed information. Unlike many competitors, Perplexity AI emphasizes high-quality, source-backed answers, making it a valuable resource for users seeking trustworthy data. The company's goal is to meet the increasing demand for immediate access to reliable information, generating revenue through subscription fees, advertising, and partnerships.

Headquarters: San Francisco, California

Year Founded: 2022

Total Funding: $890M

Company Stage: Late-stage VC

Industries: Data & Analytics, Consumer Software

Employees: 201-500

Benefits

Health Insurance

Dental Insurance

Vision Insurance

401(k) Retirement Plan

Company Equity

Risks

Pending copyright infringement class action poses legal and financial challenges.

Competition from Google's AI Mode could impact user retention and market share.

Otterly.AI's brand visibility tool may pressure Perplexity to maintain high performance.

Differentiation

Perplexity AI integrates large language models with search engines for precise responses.

The platform offers an open-source environment, enhancing public access to AI tools.

Perplexity's strategic acquisition of Carbon boosts its data connectivity capabilities.

Upsides

Partnership with Tripadvisor enhances travel planning with personalized recommendations.

$500M funding round increases valuation to $9 billion, supporting growth and innovation.

Integration with FactSet attracts financial clients with enhanced data accessibility.
