Senior Software Engineer - Observability
TetraScience
Full Time
Senior (5 to 8 years)
Candidates should have 4+ years of experience with observability as a core responsibility of previous roles, along with a deep understanding of cloud-native technologies and infrastructure-as-code (IaC) and GitOps tooling such as Terraform and Flux. They should have expertise in standing up and running monitoring, observability, and alerting systems, including OpenTelemetry tracing and the OpenTelemetry Collector, Grafana/Prometheus, PagerDuty, Alertmanager, IPMI, and SNMP. They should also bring familiarity with, and strong opinions on, telemetry signals: effective canonical logging and cost control; tracing, including context propagation, tail sampling strategies, attribute enrichment, and querying; and metrics derived from a variety of sources such as hosts, kube-state-metrics, kubelet, IPMI, and SNMP. Experience instrumenting large Kubernetes clusters and building operators is also required.
The Staff Observability Engineer will build and maintain comprehensive observability systems at massive scale, continually iterating on, updating, and automating them, and monitoring those systems well enough that the approach can serve as a best-practice model for the rest of the organization. They will instrument Kubernetes clusters, applications, and datacenter infrastructure components such as switches, PDUs, environmental sensors, cameras, and chillers. They will also act as a teacher, advising teams on instrumenting their applications in various languages (Rust, C++, TypeScript, Go), on implementing sensible SLO and alerting strategies, and on on-call best practices.
AI inference technology for scalable solutions
Groq specializes in AI inference technology, providing the Groq LPU™, which is known for its high compute speed, quality, and energy efficiency. The Groq LPU™ is designed to handle AI processing tasks quickly and effectively, making it suitable for both cloud and on-premises applications. Unlike many competitors, Groq's products are designed, fabricated, and assembled in North America, which helps maintain high standards of quality and performance. The company targets a variety of clients across different industries that require fast and efficient AI processing capabilities. Groq's goal is to deliver scalable AI inference solutions that meet the growing demands for rapid data processing in the AI and machine learning market.