Principal Site Reliability Engineer
Devoted Health - Full Time
- Senior (5 to 8 years)
Candidates must have extensive experience with enterprise-scale continuous delivery environments, 10+ years of experience in a DevOps or SRE role, and development experience with JavaScript/Node.js/TypeScript in a Linux/Mac environment. A deep understanding of SRE practices is required, including SLOs, error budgets, production readiness reviews (PRRs), and problem management, as is familiarity with cloud platforms (preferably AWS) and container/orchestration technologies. Experience with Infrastructure as Code (IaC) tools such as Terraform and with sustainable incident response in a blameless environment is preferred.
As a Principal Site Reliability Engineer, you will chart the future of Cribl’s observability and reliability systems and practices, and conceptualize and direct the evolution of reliability metrics and processes. You will engage with Product and Engineering teams to improve service delivery, measure and monitor production systems, and uncover risks and sources of errors. You will also advocate for engineering-wide improvements in reliability and observability, identify and drive down toil through innovation and automation, and participate in on-call rotations.
Data observability solutions for tech businesses
Cribl operates in the data observability market, helping tech businesses monitor, analyze, and visualize their data for better operational and security insights. The company offers two main products: Cribl Stream and Cribl Edge. Cribl Stream enables businesses to efficiently route and transform logs and metrics, either on their own infrastructure or through cloud services, ensuring timely data delivery. Cribl Edge focuses on collecting and processing real-time observability data from edge devices, which can then be sent to Cribl Stream or other destinations. Cribl distinguishes itself by integrating seamlessly with platforms like Office 365 and Microsoft Azure, allowing clients to enhance their data management capabilities. The company's goal is to create effective data ecosystems that empower organizations to make sense of their data.