Site Reliability Engineer
- Full Time
- Junior (1 to 2 years)
Candidates should possess a Bachelor’s degree in Computer Science, Engineering, or a related field, and have at least 7 years of experience in Site Reliability Engineering, with a strong focus on distributed systems and cloud infrastructure. Experience with database technologies, particularly columnar databases like ClickHouse, is highly desirable. A strong understanding of operating systems, networking, and automation tools is also required.
As a Senior Site Reliability Engineer, you will collaborate with engineering teams to design and implement scalable, secure, and highly available systems for ClickHouse Cloud. You will establish and manage service level objectives (SLOs) and service level agreements (SLAs), ensure infrastructure components have monitoring and alerting, enhance incident response processes, and continuously improve the reliability and performance of ClickHouse services. Additionally, you will contribute to chaos engineering initiatives and guide teams in designing and implementing fault-tolerant distributed systems.
High-speed column-oriented database management system
ClickHouse provides a high-speed, column-oriented database management system designed for developers and businesses that manage large-scale data. Its primary product processes analytical queries quickly by storing the values of each column together, so a query reads only the columns it needs. This makes it significantly faster than traditional row-oriented databases, especially in Online Analytical Processing (OLAP) scenarios. ClickHouse stands out from competitors by offering a free, open-source database that can be deployed on local machines or in the cloud, along with a fully managed service on AWS, GCP, and Microsoft Azure. The company's goal is to deliver a cost-effective solution that simplifies data management for its clients, as evidenced by user feedback highlighting substantial cost savings.