ML Systems Engineer at Genmo

San Francisco, California, United States

Compensation: $150,000 – $200,000
Experience Level: Mid-level (3 to 4 years)
Job Type: Full Time
Visa: Unknown
Industries: Artificial Intelligence

Requirements

Candidates should possess a Bachelor’s or Master’s degree in Computer Science or a related field, along with a minimum of four years of Machine Learning engineering experience, with at least two years specifically focused on model serving. Production experience with high-performance model serving frameworks such as vLLM, SGLang, or TensorRT-LLM is required, alongside strong Python proficiency and experience with PyTorch. Familiarity with model compilation and optimization techniques, including TensorRT, ONNX, and quantization, is also necessary. A proven track record of building inference systems at scale, handling 10,000+ queries per second (QPS), is expected. Knowledge of attention mechanisms and transformer architectures, as well as GPU architecture and memory management, is valued.

Responsibilities

The ML Systems Engineer will design and implement high-performance model serving infrastructure that handles millions of requests daily and supports streaming, batching, and multi-modal inputs. They will develop automated model compilation and optimization pipelines and tune serving systems for throughput, latency, and GPU utilization. The role also involves building monitoring and observability for model-specific metrics, collaborating with researchers to move models into production, and implementing A/B testing and canary deployments. In addition, the engineer will integrate the serving layer with platform infrastructure and contribute to advanced serving optimizations, including continuous batching and streaming generation patterns.

Skills

ML Serving
Model Optimization
TensorRT
PyTorch
Python
Containerization
Orchestration
Model Compilation
Transformer Architectures
Attention Mechanisms
API Gateways
Queue Systems

Genmo

AI tools for multimedia content creation

About Genmo

Genmo.ai specializes in providing AI tools for generating and editing multimedia content, including images, videos, and presentations. Users can upload images and animate specific parts, like transforming a static sky into a timelapse, or create entire movies by refining ideas, generating scenes, and selecting transitions. The platform caters to both individual content creators and businesses, operating on a subscription model with various service tiers. Genmo.ai differentiates itself by continuously enhancing its technology and focusing on user intent, ensuring that clients have powerful tools to realize their creative projects.

Headquarters: San Francisco, California
Year Founded: N/A
Total Funding: $29.2M
Company Stage: Early VC
Industries: Consumer Software, AI & Machine Learning
Employees: 1-10

Risks

Server crashes during the Mochi-1 launch could harm customer trust and satisfaction.
Open-source nature of Mochi-1 may lead to increased competition from developers.
Major tech players entering generative AI market could overshadow Genmo's offerings.

Differentiation

Genmo.ai offers unique AI tools for animating images and generating entire movies.
The platform supports both B2B and B2C models, catering to diverse client needs.
Genmo.ai's subscription model provides flexible access to advanced multimedia editing features.

Upsides

Launch of Mochi-1 model positions Genmo as a competitor to industry leaders.
Rising demand for AI-driven video editing boosts Genmo's market potential.
Subscription-based revenue model ensures steady income and opportunities for upselling.
