Groq

Senior Software Engineer, FPGA

London, England, United Kingdom

Compensation: Not specified
Experience Level: Senior (5 to 8 years)
Job Type: Full-time
Visa: Unknown
Industries: Artificial Intelligence, Semiconductor, Cloud Computing

Position Overview

  • Location Type: Hybrid
  • Employment Type: Full-time
  • Salary: Not specified

Groq delivers fast, efficient AI inference. Our LPU-based system powers GroqCloud™, giving businesses and developers the speed and scale they need. Headquartered in Silicon Valley, we are on a mission to make high performance AI compute more accessible and affordable. When real-time AI is within reach, anything is possible. Build fast.

Sr. Software Engineer - FPGA

Responsibilities

  • Deliver end-to-end FPGA solutions bridging the gap between the world and our accelerators.
  • Build and operate real-time, distributed compute frameworks and runtimes to deliver planet-scale inference for LLMs and advanced AI applications at ultra-low latency, optimized for heterogeneous hardware and dynamic global workloads.
  • Develop deterministic, low-overhead hardware abstractions for thousands of synchronously coordinated GroqChips across a software-scheduled interconnection network. Prioritize fault tolerance, real-time diagnostics, ultra-low-latency execution, and mission-critical reliability (a minimal scheduling sketch follows this list).
  • Future-proof Groq’s software stack for next-gen silicon, innovative multi-chip topologies, emerging form factors, and heterogeneous co-processors (e.g., FPGAs).
  • Foster collaboration across cloud, compiler, infra, data centers, and hardware teams to align engineering efforts, enable seamless integrations, and drive progress toward shared goals.
  • Your code will run at the edge of physics—every clock cycle saved reduces latency for millions of users and extends Groq’s lead in the AI compute race.
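
The "deterministic, low-overhead hardware abstractions" bullet above is the core of the model: because the interconnect is software-scheduled, the host issues every command at a precomputed cycle rather than reacting to runtime events. The sketch below illustrates the idea in C++; every name in it (Command, CycleScheduledQueue) is hypothetical and stands in for whatever the actual runtime uses.

```cpp
// Minimal sketch of a statically scheduled command timeline. All names are
// illustrative; a real runtime would pace issues against the fabric clock.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>

struct Command {
    uint64_t issue_cycle;  // cycle at which the host must issue this command
    uint32_t chip_id;      // target chip in the synchronized fabric
    uint32_t opcode;       // device-specific operation
};

class CycleScheduledQueue {
public:
    void add(Command c) { timeline_.push_back(c); }

    // Sort once, then replay in issue order: no locks, no dynamic arbitration,
    // so every run executes the same commands in the same order.
    void run() {
        std::sort(timeline_.begin(), timeline_.end(),
                  [](const Command& a, const Command& b) {
                      return a.issue_cycle < b.issue_cycle;
                  });
        for (const Command& c : timeline_) {
            // A real runtime would wait until the fabric clock reaches
            // c.issue_cycle, then write c.opcode to the chip's doorbell.
            std::cout << "cycle " << c.issue_cycle << ": chip " << c.chip_id
                      << " op " << c.opcode << "\n";
        }
    }

private:
    std::vector<Command> timeline_;
};

int main() {
    CycleScheduledQueue q;
    q.add({200, 1, 0x2});  // e.g., a matmul dispatch on chip 1
    q.add({100, 0, 0x1});  // e.g., a DMA load on chip 0
    q.run();               // deterministic: identical output on every run
}
```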

Requirements

  • Deep curiosity about system internals, from kernel-level interactions to hardware dependencies, and the fearlessness to solve problems across abstraction layers, down to the HDL for our chips.
  • Expertise in computer architecture, compiler backends, algorithms, and hardware-software interfaces.
  • Mastery of system-level programming (Haskell, C++, or similar) with emphasis on low-level optimizations and hardware-aware design.
  • Consistent shipment of high-impact, production-ready code while collaborating effectively with cross-functional teams.
  • Excellence in profiling and optimizing systems for latency, throughput, and efficiency, with zero tolerance for wasted cycles or resources (a short measurement sketch follows this list).
  • Commitment to automated testing and CI/CD pipelines, believing that "untested code is broken code."
  • Pragmatic technical judgment, balancing short-term velocity with long-term system health.
  • A habit of writing empathetic, maintainable code with strong version control and modular design, prioritizing readability and usability for future teammates.
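
On the profiling point above: for real-time inference, the tail (p99 and beyond) matters far more than the mean, since one slow request in a hundred is still a slow request for some user. A minimal, self-contained harness like the sketch below, using only std::chrono, is the kind of measurement this work starts from; the inner loop is a stand-in for any hot path.

```cpp
// Minimal latency-measurement sketch. Illustrative only; production profiling
// would lean on hardware counters (perf, rdtsc) rather than steady_clock.
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
    using clock = std::chrono::steady_clock;
    std::vector<double> samples;
    samples.reserve(10000);

    volatile long sink = 0;  // keeps the "work" from being optimized away
    for (int i = 0; i < 10000; ++i) {
        auto t0 = clock::now();
        for (int j = 0; j < 1000; ++j) sink = sink + j;  // stand-in hot path
        auto t1 = clock::now();
        samples.push_back(
            std::chrono::duration<double, std::nano>(t1 - t0).count());
    }

    // Report percentiles, not the mean: tail latency is what users feel.
    std::sort(samples.begin(), samples.end());
    std::printf("p50 = %.0f ns, p99 = %.0f ns\n",
                samples[samples.size() / 2],
                samples[samples.size() * 99 / 100]);
}
```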

Nice to Have

  • Experience with FPGA development.
  • Experience with VFIO drivers for userspace device access (see the sketch after this list).
  • Familiarity with hardware description languages (HDLs).
  • Experience shipping complex projects in fast-paced environments while maintaining team alignment and stakeholder support.
  • Hands-on optimization of performance-critical applications using GPUs, FPGAs, or ASICs (e.g., memory management, kernel optimization).
  • Familiarity with ML frameworks (e.g., PyTorch) and compiler tooling (e.g., MLIR) for AI/ML workflow integration.
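
On the VFIO item above: VFIO is the Linux framework for safe userspace device access through the IOMMU, and it is a common way to drive PCIe-attached accelerators (FPGAs included) without writing a custom kernel driver. The sketch below follows the canonical container → group → device flow from the kernel's VFIO documentation; the group number (42) and the PCI address are placeholders, and error handling is trimmed for brevity.

```cpp
// Minimal VFIO bring-up sketch (placeholders: group 42, device 0000:01:00.0).
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>
#include <cstdio>

int main() {
    // 1. Open the VFIO container and verify the kernel API version.
    int container = open("/dev/vfio/vfio", O_RDWR);
    if (container < 0 ||
        ioctl(container, VFIO_GET_API_VERSION) != VFIO_API_VERSION) {
        std::perror("vfio container");
        return 1;
    }

    // 2. Open the IOMMU group the device was bound into.
    int group = open("/dev/vfio/42", O_RDWR);
    if (group < 0) { std::perror("vfio group"); return 1; }

    struct vfio_group_status status{};
    status.argsz = sizeof(status);
    ioctl(group, VFIO_GROUP_GET_STATUS, &status);
    if (!(status.flags & VFIO_GROUP_FLAGS_VIABLE)) {
        std::fprintf(stderr, "group not viable: bind all devices to vfio-pci\n");
        return 1;
    }

    // 3. Attach the group to the container and select the type-1 IOMMU model.
    ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
    ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

    // 4. Get a device fd; BAR regions and interrupts hang off this fd.
    int device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, "0000:01:00.0");
    if (device < 0) { std::perror("vfio device"); return 1; }

    struct vfio_device_info info{};
    info.argsz = sizeof(info);
    ioctl(device, VFIO_DEVICE_GET_INFO, &info);
    std::printf("regions: %u, irqs: %u\n", info.num_regions, info.num_irqs);

    close(device);
    close(group);
    close(container);
}
```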

The Ideal Candidate

  • Initiates (without derailing): Spots opportunities to solve problems or improve processes—while staying aligned with team priorities.
  • Builds stuff that actually ships: Values “code in prod” over “perfect slides.” Delivers real value instead of polishing whiteboard ideas.
  • Is a craftsmanship junkie: Always asks, “How can we make this better?” and loves diving into details.
  • Plays to win (together): Believes winning means everyone wins. Aligns goals with teammates and customers because no one succeeds alone.
  • Owns it from whiteboard to watts: Takes full responsibility, from debugging to deployment to celebrating with users (or fixing it again), and ensures the code stays fast and scales well.

This isn't your typical corporate job; it's a mission to redefine AI compute.

Skills

FPGA
Haskell
C++
Computer Architecture
Compiler Backends
Algorithms
HDL
Low-level Programming
System-level Programming

Groq

AI inference technology for scalable solutions

About Groq

Groq specializes in AI inference technology, providing the Groq LPU™, which is known for its high compute speed, quality, and energy efficiency. The Groq LPU™ is designed to handle AI processing tasks quickly and effectively, making it suitable for both cloud and on-premises applications. Unlike many competitors, Groq's products are designed, fabricated, and assembled in North America, which helps maintain high standards of quality and performance. The company targets a variety of clients across different industries that require fast and efficient AI processing capabilities. Groq's goal is to deliver scalable AI inference solutions that meet the growing demands for rapid data processing in the AI and machine learning market.

Headquarters: Mountain View, California
Year Founded: 2016
Total Funding: $1,266.5M
Company Stage: Series D
Industries: AI & Machine Learning
Employees: 201-500

Benefits

Remote Work Options
Company Equity

Risks

Increased competition from SambaNova Systems and Gradio in high-speed AI inference.
Geopolitical risks in the MENA region may affect the Saudi Arabia data center project.
Rapid expansion could strain Groq's operational capabilities and supply chain.

Differentiation

Groq's LPU offers exceptional compute speed and energy efficiency for AI inference.
The company's products are designed and assembled in North America, ensuring high quality.
Groq emphasizes deterministic performance, providing predictable outcomes in AI computations.

Upsides

Groq secured $640M in Series D funding, boosting its expansion capabilities.
Partnership with Aramco Digital aims to build the world's largest inferencing data center.
Integration with Touchcast's Cognitive Caching enhances Groq's hardware for hyper-speed inference.
