ClickHouse

Senior Python Engineer - ML and Data Science

Netherlands

Compensation: Not Specified
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Database, Analytics, Software

About ClickHouse

Established in 2009, ClickHouse leads the industry with its open-source column-oriented database system, driven by the vision of becoming the fastest OLAP database globally. The company empowers users to generate real-time analytical reports through SQL queries, emphasizing speed in managing escalating data volumes. Enterprises globally, including Lyft, Sony, IBM, GitLab, Twilio, HubSpot, and many more, rely on ClickHouse Cloud. ClickHouse is available as open-source software or in the cloud on AWS, GCP, Azure, and Alibaba Cloud.

ClickHouse is the world’s fastest open-source columnar database for real-time analytics. We empower data-driven organizations with high-performance, scalable analytics solutions. As we continue expanding, we are looking for a Senior Python Software Engineer to enhance our Python capabilities and improve the experience for data scientists and machine learning teams using ClickHouse.

As a Data Science-Focused Software Engineer, you will be at the intersection of engineering and data science. Your primary responsibility will be to enhance ClickHouse’s Python ecosystem, ensuring seamless integration for data scientists, ML engineers, and analytics professionals for data exploration and preparation workloads. You will work closely with engineering, product, and community teams to optimize Python libraries, enhance data science workflows, and contribute to open-source initiatives.

What will you do?

  • Improve ClickHouse's Data Science Python experience: Design and implement features that simplify data ingestion, transformation, and analysis for data scientists using Python.
  • Contribute to ClickHouse's Python Integrations: Enhance ClickHouse’s existing Python integrations to provide a seamless data science experience.
  • Work with Open-Source Ecosystem: Contribute to ClickHouse’s open-source repositories, ensuring compatibility with popular data science toolkits.
  • Performance Optimization: Ensure efficient query execution and data handling when interfacing with Python.
  • Collaborate with Internal & External Teams: Work closely with product managers, engineers, and the data science community to gather feedback and refine features.
  • Advocate for Best Practices: Educate users on best practices for leveraging ClickHouse in data science applications via examples and reference architectures.

About you

  • Strong proficiency in Python and familiarity with data science libraries and data processing frameworks such as Pandas, Polars, Scikit-learn, and PyTorch.
  • Experience working with databases, query engines, or analytical tools (Snowflake, BigQuery, ClickHouse, DuckDB, etc.).
  • Understanding of distributed systems and OLAP databases.
  • Hands-on experience with data engineering workflows and integrating databases with machine learning or analytics pipelines.
  • Contributions to open-source projects or experience working in an open-source development environment.
  • Strong problem-solving skills and ability to work in a cross-functional team.
  • Strong communication skills.

Nice-to-Have

  • Prior experience with ClickHouse or cloud-based OLAP analytics systems.
  • Understanding of ML workloads and integration with analytical databases.
  • Prior experience working in a developer-first or database technology company.

Compensation

For roles based in the United States, you can find above our typical starting salary ranges for this role, depending on your specific location. The positioning of offers within a certain range depends on various factors, including: candidate experience, qualifications, skills, business requirements and geographical location. If you have any questions or comments about compensation as a candidate, please get in touch with us at paytransparency@clickhouse.com.

Perks

  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries.
  • Healthcare - Employer contributions towards your healthcare.
  • Equity in the company - Every new team member who joins our compa

Skills

Python
Data Science
Machine Learning
SQL
Data Ingestion
Data Transformation
Data Analysis
Performance Optimization
Open-Source

ClickHouse

High-speed column-oriented database management system

About ClickHouse

ClickHouse provides a high-speed, column-oriented database management system designed for developers and businesses that manage large-scale data. Its primary product processes analytical queries quickly by storing data from the same columns together, making it significantly faster than traditional row-oriented databases, especially in Online Analytical Processing (OLAP) scenarios. ClickHouse stands out from competitors by offering a free, open-source database that can be deployed on local machines or in the cloud, along with a fully managed service on platforms like AWS, GCP, and Microsoft Azure. The company's goal is to deliver a cost-effective solution that simplifies data management for its clients, as evidenced by user feedback highlighting substantial cost savings.

Headquarters: San Francisco, California
Year Founded: 2021
Total Funding: $291.8M
Company Stage: Series B
Industries: Data & Analytics, Enterprise Software
Employees: 201-500

Benefits

Health Insurance
Unlimited Paid Time Off
Flexible Work Hours
Remote Work Options
Stock Options
Home Office Stipend

Risks

Redpanda Serverless poses a competitive threat in real-time data processing.
Integration challenges with PeerDB may delay expected benefits.
Dependency on Supabase could pose operational risks.

Differentiation

ClickHouse's column-oriented design offers superior speed for analytical queries.
The open-source model allows flexible deployment across various environments.
Integration with Grafana enhances data visualization capabilities.

Upsides

Partnership with Alibaba Cloud boosts presence in the Chinese market.
Acquisition of PeerDB enhances real-time analytics capabilities.
Launch of ClickPipes improves data processing efficiency for real-time updates.
