Sr. Infrastructure Engineer
Platform Science - Full Time
- Senior (5 to 8 years)
Candidates should have experience with Linux and Kubernetes systems and be comfortable working in a terminal. They should be familiar with infrastructure-as-code and Git-based workflows (e.g., Terraform, Flux, Kustomize), able to write and maintain basic tooling in Go, Python, or Bash, and understand networking fundamentals (IPAM, VLANs, DHCP, DNS). Working knowledge of storage concepts (block vs. object, NFS, RAID, etc.) is also required, along with a strong sense of ownership and a willingness to dive into hardware, firmware, or low-level provisioning issues.
The Senior Infrastructure Engineer will support the provisioning and deployment of Kubernetes clusters on bare metal servers, and help build and maintain tooling for bare metal provisioning, including DHCP, DNS, PXE/iPXE/HTTPBoot, and Talos Linux machine configuration. They will write and maintain scripts and services (Go, Python, Bash) to automate deployment workflows across new and existing sites, partner with data center operations and networking teams to ensure hardware is correctly configured, connected, and ready for use, and manage infrastructure configuration using tools like Git, Flux, and Terraform. They will also contribute to system documentation, runbooks, and tooling that make our infrastructure reliable and repeatable.
AI inference technology for scalable solutions
Groq specializes in AI inference technology, providing the Groq LPU™, which is known for its high compute speed, quality, and energy efficiency. The Groq LPU™ is designed to handle AI processing tasks quickly and effectively, making it suitable for both cloud and on-premises applications. Unlike many competitors, Groq's products are designed, fabricated, and assembled in North America, which helps maintain high standards of quality and performance. The company serves clients across different industries that require fast and efficient AI processing capabilities. Groq's goal is to deliver scalable AI inference solutions that meet the growing demand for rapid data processing in the AI and machine learning market.