ClickHouse

Senior Python Engineer - ML and Data Science

Netherlands

Compensation: Not Specified
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Database, Analytics, Software

Requirements

Candidates must possess strong proficiency in Python and familiarity with data science libraries and data processing frameworks such as Pandas, Polars, Scikit-learn, and PyTorch. Experience working with databases, query engines, or analytical tools like Snowflake, BigQuery, ClickHouse, or DuckDB is required. A solid understanding of distributed systems and OLAP databases, along with hands-on experience in data engineering workflows and integrating databases with machine learning or analytics pipelines, is essential. Contributions to open-source projects or experience in an open-source development environment, strong problem-solving skills, the ability to work in a cross-functional team, and excellent communication skills are also necessary.

Responsibilities

The Senior Python Engineer will improve ClickHouse's Python ecosystem and the experience for data scientists and machine learning teams. Responsibilities include designing and implementing features to simplify data ingestion, transformation, and analysis for Python users, enhancing existing Python integrations for a seamless data science experience, and contributing to ClickHouse's open-source repositories. The role also involves ensuring efficient query execution and data handling when interfacing with Python, collaborating with internal and external teams to gather feedback and refine features, and educating users on best practices for leveraging ClickHouse in data science applications.

Skills

Python
Data Science
Machine Learning
SQL
Data Ingestion
Data Transformation
Data Analysis
Performance Optimization
Open-Source

ClickHouse

High-speed column-oriented database management system

About ClickHouse

ClickHouse provides a high-speed, column-oriented database management system designed for developers and businesses that manage large-scale data. Its primary product processes analytical queries quickly by storing the values of each column together, which makes it significantly faster than traditional row-oriented databases, especially in Online Analytical Processing (OLAP) scenarios. ClickHouse stands out from competitors by offering a free, open-source database that can be deployed on local machines or in the cloud, along with a fully managed service on platforms like AWS, GCP, and Microsoft Azure. The company's goal is to deliver a cost-effective solution that simplifies data management for its clients, as reflected in user feedback that highlights substantial cost savings.

Headquarters: San Francisco, California
Year Founded: 2021
Total Funding: $291.8M
Company Stage: Series B
Industries: Data & Analytics, Enterprise Software
Employees: 201-500

Benefits

Health Insurance
Unlimited Paid Time Off
Flexible Work Hours
Remote Work Options
Stock Options
Home Office Stipend

Risks

Redpanda Serverless poses a competitive threat in real-time data processing.
Integration challenges with PeerDB may delay expected benefits.
Dependency on Supabase could pose operational risks.

Differentiation

ClickHouse's column-oriented design offers superior speed for analytical queries.
The open-source model allows flexible deployment across various environments.
Integration with Grafana enhances data visualization capabilities.

Upsides

Partnership with Alibaba Cloud boosts presence in the Chinese market.
Acquisition of PeerDB enhances real-time analytics capabilities.
Launch of ClickPipes improves data processing efficiency for real-time updates.
