Groq

Machine Learning Engineer, Post Training

Toronto, Ontario, Canada

Compensation: Not Specified
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Artificial Intelligence, AI & Machine Learning, Cloud Computing

About Groq

Groq delivers fast, efficient AI inference. Our LPU-based system powers GroqCloud™, giving businesses and developers the speed and scale they need. Headquartered in Silicon Valley, we are on a mission to make high performance AI compute more accessible and affordable. When real-time AI is within reach, anything is possible. Build fast.

Position Overview

We are seeking a highly skilled Machine Learning Engineer to join our advanced model development team. This role focuses on pre-training, continued training, and post-training of models, with a particular emphasis on draft model optimization for speculative decoding and on quantization-aware training (QAT). The ideal candidate has deep experience with training methodologies, open-weight models, and performance tuning for inference.

Responsibilities & Outcomes

  • Lead pre-training and post-training efforts for draft models tailored to speculative decoding architectures.
  • Conduct continued training and post-training of open-weight models for non-draft (standard) inference scenarios.
  • Implement and optimize quantization-aware training pipelines to enable low-precision inference with minimal accuracy loss.
  • Collaborate with model architecture, inference, and systems teams to evaluate model readiness across training and deployment stages.
  • Develop tooling and evaluation metrics for training effectiveness, draft model fidelity, and speculative hit-rate optimization.
  • Contribute to experimental designs for novel training regimes and speculative decoding strategies.
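For context on the speculative-decoding work described above: a small draft model cheaply proposes a short run of tokens, and the larger target model verifies them, keeping the accepted prefix. The "speculative hit rate" is the fraction of drafted tokens the target accepts. The toy sketch below illustrates only that acceptance loop; `draft_propose` and `target_accept_prob` are hypothetical stand-ins, not Groq APIs.

```python
import random

def speculative_step(draft_propose, target_accept_prob, k=4, rng=random.random):
    """One speculative-decoding step (toy illustration).

    draft_propose(i)        -> token the draft model proposes at offset i
    target_accept_prob(tok) -> probability the target model agrees with tok
    Returns the accepted prefix; verification stops at the first rejection.
    """
    accepted = []
    for i in range(k):
        tok = draft_propose(i)
        if rng() < target_accept_prob(tok):
            accepted.append(tok)
        else:
            break  # target disagrees: fall back to the target model's own token
    return accepted
```

A well-trained draft model pushes the acceptance probability up, so more of the `k` drafted tokens survive verification per target-model pass.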

Ideal Candidates Have/Are

  • 5+ years of experience in machine learning, with a strong focus on model training.
  • Proven experience with transformer-based architectures (e.g., LLaMA, Mistral, Gemma).
  • Deep understanding of speculative decoding and draft model usage.
  • Hands-on experience with quantization-aware training, including PyTorch QAT workflows or similar frameworks.
  • Familiarity with open-weight foundation models and continued/pre-training techniques.
  • Proficient in Python and ML frameworks such as PyTorch, JAX, or TensorFlow.
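On the QAT requirement above: quantization-aware training inserts "fake quantization" into the forward pass, meaning values are rounded to the low-precision grid and clamped, then dequantized back to float, so the model learns weights that tolerate quantization error. A framework-free sketch of that round-trip (illustrative only, not a specific PyTorch API):

```python
def fake_quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Quantize-dequantize ("fake quant") as used inside QAT forward passes.

    The value is mapped to an int8-style grid and back, so downstream
    layers see the rounding and clamping error during training.
    """
    q = round(x / scale) + zero_point      # map to the integer grid
    q = max(qmin, min(qmax, q))            # clamp to the representable range
    return (q - zero_point) * scale        # dequantize back to float
```

For example, with `scale=0.01` an activation of `0.337` comes back as `0.34`, and an out-of-range `10.0` is clamped to `1.27`; training against these perturbed values is what keeps accuracy loss minimal at low-precision inference time.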

Preferred Qualifications

  • Experience optimizing models for fast inference and sampling in production environments.
  • Exposure to distributed training, low-level kernel optimizations, and inference-time system constraints.
  • Publications or contributions to open-source ML projects.

Attributes of a Groqster

  • Humility: Egos are checked at the door.
  • Collaborative & Team Savvy: We make up the smartest person in the room, together.
  • Growth & Giver Mindset: Learn it all versus know it all, we share knowledge generously.
  • Curious & Innovative: Take a creative approach to projects, problems, and design.
  • Passion, Grit, & Boldness: No limit thinking, fueling informed risk taking.

If this sounds like you, we’d love to hear from you!

Compensation

  • Salary: TBD (determined by skills, qualifications, experience, and internal benchmarks).
  • At Groq, a competitive base salary is part of our comprehensive compensation package, which includes equity and benefits.

Location

  • Location Type: Some roles may require being located near or on our primary sites, as indicated in the job description.

Company Information

At Groq, our goal is to hire and promote an exceptional workforce as diverse as the global populations we serve. Groq is an equal opportunity employer committed to diversity, inclusion, and belonging in all aspects of our organization. We value and celebrate diversity in thought, beliefs, talent, expression, and backgrounds. We know that our individual differences make us better.

Groq is an Equal Opportunity Employer that is committed to inclusion and diversity. Qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, gender, sexual orientation, gender identity, disability or protected veteran status. We also take affirmative action.

Skills

Machine Learning
Model Training
Speculative Decoding
Quantization-Aware Training (QAT)
Transformer Architectures
LLaMA
Mistral
Gemma
PyTorch
Performance Tuning
Inference Optimization
Tooling Development
Evaluation Metrics

Groq

AI inference technology for scalable solutions

About Groq

Groq specializes in AI inference technology, providing the Groq LPU™, which is known for its high compute speed, quality, and energy efficiency. The Groq LPU™ is designed to handle AI processing tasks quickly and effectively, making it suitable for both cloud and on-premises applications. Unlike many competitors, Groq's products are designed, fabricated, and assembled in North America, which helps maintain high standards of quality and performance. The company targets a variety of clients across different industries that require fast and efficient AI processing capabilities. Groq's goal is to deliver scalable AI inference solutions that meet the growing demands for rapid data processing in the AI and machine learning market.

Headquarters: Mountain View, California
Year Founded: 2016
Total Funding: $1,266.5M
Company Stage: Series D
Industries: AI & Machine Learning
Employees: 201-500

Benefits

Remote Work Options
Company Equity

Risks

Increased competition from SambaNova Systems and Gradio in high-speed AI inference.
Geopolitical risks in the MENA region may affect the Saudi Arabia data center project.
Rapid expansion could strain Groq's operational capabilities and supply chain.

Differentiation

Groq's LPU offers exceptional compute speed and energy efficiency for AI inference.
The company's products are designed and assembled in North America, ensuring high quality.
Groq emphasizes deterministic performance, providing predictable outcomes in AI computations.

Upsides

Groq secured $640M in Series D funding, boosting its expansion capabilities.
Partnership with Aramco Digital aims to build the world's largest inferencing data center.
Integration with Touchcast's Cognitive Caching enhances Groq's hardware for hyper-speed inference.
