ClickHouse

Site Reliability Engineer - Core C++ Team

Canada

Compensation: Not specified
Experience Level: Junior (1 to 2 years)
Job Type: Full Time
Visa: Unknown
Industries: Database Systems, Cloud Services, Open Source Software

Position Overview

  • Location Type: Remote (any country ClickHouse has a hiring presence)
  • Job Type: Full-time
  • Salary: Not specified

ClickHouse is a leading open-source column-oriented database system, focused on providing the fastest OLAP database globally. We empower users to generate real-time analytical reports via SQL queries, excelling at managing escalating data volumes. ClickHouse Cloud is utilized by enterprises worldwide, including Lyft, Sony, IBM, GitLab, and Twilio.

We are expanding our Site Reliability Engineering team for ClickHouse Core. As an early member of this team, you will be instrumental in building and leading processes to ensure and enhance the reliability, availability, scalability, and performance of ClickHouse. You will collaborate with various teams (Control Plane, Dataplane, Security, Support, Operations) to guide the optimal implementation of ClickHouse for our customers. This role involves owning engineering escalation management, response, investigations, blameless post-mortem analysis, and continuous improvement of ClickHouse operations and optimization in the cloud. This is a significant opportunity to impact our high-performance, elastic, and limitless scale ClickHouse Cloud offering.

Responsibilities

  • Continuously improve the reliability and performance of ClickHouse core.
  • Develop and implement metrics and alerts to proactively identify and prevent production issues before they impact customers.
  • Investigate common customer-encountered problems in ClickHouse Core to identify root causes, submit bug fixes, create issue reports, and suggest improvements.
  • Enhance and refine incident response processes and post-mortem analysis for ClickHouse core-related outages, including customer communication in coordination with support and Cloud teams.
  • Plan, enable, and drive Chaos Engineering initiatives across Engineering teams based on internal priorities.
  • Manage on-call processes to respond to performance and reliability issues, establishing best practices for coordinating escalations to minimize customer impact.

Requirements

  • Bachelor's or Master's degree in Computer Science or a related field.
  • Minimum of 8 years of experience in Reliability Engineering, QA, or customer-facing engineering.
  • Prior experience operating ClickHouse or other SQL databases in a production environment.
  • Strong understanding of distributed database internals and SQL; ClickHouse expertise is a significant advantage.
  • Proficiency in scripting languages such as Shell or Python.
  • Ability to read and understand C++ code.
  • Familiarity with cloud computing platforms like AWS, Azure, or Google Cloud Platform.
  • Excellent problem-solving skills and robust production debugging capabilities.
  • Ability to thrive in a fast-paced global team environment, acting as a business partner to drive progress.
  • High level of responsibility, ownership, and accountability.
  • Excellent communication skills.

Compensation

  • For roles based in the United States, salary ranges are provided.
  • Offer positioning within the range depends on candidate experience, qualifications, skills, business requirements, and geographical location.
  • For compensation inquiries, please contact paytransparency@clickhouse.com.

Perks

  • Flexible work environment. ClickHouse is a globally distributed company.

Skills

C++
Reliability Engineering
Monitoring and Alerting
Performance Optimization
Root Cause Analysis
Postmortem Analysis
Cloud Operations
Scalability
High-Performance Computing

ClickHouse

High-speed column-oriented database management system

About ClickHouse

ClickHouse provides a high-speed, column-oriented database management system designed for developers and businesses that manage large-scale data. Its primary product processes analytical queries quickly by storing data from the same columns together, making it significantly faster than traditional row-oriented databases, especially in Online Analytical Processing (OLAP) scenarios. ClickHouse stands out from competitors by offering a free, open-source database that can be deployed on local machines or in the cloud, along with a fully managed service on platforms like AWS, GCP, and Microsoft Azure. The company's goal is to deliver a cost-effective solution that simplifies data management for its clients, as evidenced by user feedback highlighting substantial cost savings.

Headquarters: San Francisco, California
Year Founded: 2021
Total Funding: $291.8M
Company Stage: Series B
Industries: Data & Analytics, Enterprise Software
Employees: 201-500

Benefits

Health Insurance
Unlimited Paid Time Off
Flexible Work Hours
Remote Work Options
Stock Options
Home Office Stipend

Risks

Redpanda Serverless poses a competitive threat in real-time data processing.
Integration challenges with PeerDB may delay expected benefits.
Dependency on Supabase could pose operational risks.

Differentiation

ClickHouse's column-oriented design offers superior speed for analytical queries.
The open-source model allows flexible deployment across various environments.
Integration with Grafana enhances data visualization capabilities.

Upsides

Partnership with Alibaba Cloud boosts presence in the Chinese market.
Acquisition of PeerDB enhances real-time analytics capabilities.
Launch of ClickPipes improves data processing efficiency for real-time updates.
