Staff Data Engineer
Postscript
Full Time
Expert & Leadership (9+ years)
Candidates must possess over 5 years of software development experience focused on data-intensive solutions. A strong command of Java and the JVM ecosystem, including memory management and performance tuning, is essential. Experience with concurrent programming in Java, including threads and asynchronous patterns, is required. Familiarity with connectors, sinks, or sources for big data frameworks like Apache Spark, Flink, Beam, or Kafka Connect is necessary. A solid understanding of database fundamentals, SQL, data modeling, query optimization, and OLAP databases is expected. Excellent communication and collaboration skills are crucial. A passion for open-source development and community engagement is a must. Bonus points for prior OSS contributions, familiarity with ClickHouse or similar platforms, expertise in building big data connectors, knowledge of Python for data engineering, and an understanding of JDBC and network protocols.
As a Senior Software Engineer, you will be a core contributor to ClickHouse's data engineering ecosystem, focusing on JVM-based frameworks. You will own and maintain critical parts of the data framework integrations, from database drivers to SDKs and connectors. Your work will involve crafting tools that let data engineers leverage ClickHouse's speed and scale, impacting how companies process massive datasets. You will collaborate with the open-source community, internal teams, and enterprise users to ensure high-performance, reliable, and developer-friendly JVM integrations.
High-speed column-oriented database management system
ClickHouse provides a high-speed, column-oriented database management system designed for developers and businesses that manage large-scale data. Its primary product processes analytical queries quickly by storing data from the same columns together, making it significantly faster than traditional row-oriented databases, especially in Online Analytical Processing (OLAP) scenarios. ClickHouse stands out from competitors by offering a free, open-source database that can be deployed on local machines or in the cloud, along with a fully managed service on platforms like AWS, GCP, and Microsoft Azure. The company's goal is to deliver a cost-effective solution that simplifies data management for its clients, as evidenced by user feedback highlighting substantial cost savings.