Lead Data Pipeline Engineer
Two Six Technologies
Full Time
Junior (1 to 2 years)
Candidates should have 4+ years of experience as a Data Engineer, Analytics Engineer, or in a similar role, with significant experience building and maintaining production data pipelines. Expert proficiency in SQL for complex data transformations, performance optimization, and working with large datasets is required; strong PostgreSQL experience is highly preferred. Proficiency in Python or another programming language for data pipeline development, automation, and scripting is also necessary.
The Data Engineer will design, build, and maintain scalable data pipelines and ETL/ELT processes that ingest, transform, and deliver data from sources including application databases, event streams, and third-party APIs. They will architect and optimize data warehouse solutions, ensuring efficient storage, retrieval, and processing of large-scale time-series and analytical datasets. The role involves implementing and maintaining data quality frameworks, monitoring systems, and alerting mechanisms to ensure data accuracy, completeness, and reliability across all data systems. Collaboration with Product Managers, Marketing, Finance, and Sales is crucial to understand data requirements and build infrastructure that enables self-service analytics and advanced data exploration. The Data Engineer will also optimize database performance, including query optimization, indexing strategies, and capacity planning, and will work closely with Engineering teams to implement event tracking, logging, and instrumentation. Finally, the role supports real-time data processing and streaming analytics use cases, leveraging Timescale's time-series capabilities, and champions data engineering best practices while maintaining data documentation, schemas, and governance processes.
Time series data management and analytics
Timescale specializes in managing time series data through its main product, TimescaleDB, an open-source database designed to efficiently handle large volumes of data points collected over time. Built on PostgreSQL, TimescaleDB offers reliable performance and operational efficiency. The company serves various industries, including IoT and financial services, enabling clients to analyze and gain insights from time series data for improved decision-making and automation. Timescale differentiates itself by providing both on-premise and cloud-based solutions, along with a freemium model: users can access the core product for free, while premium features and enterprise support generate revenue. Timescale's goal is to make it easier and more effective for organizations to manage, analyze, and leverage their time series data.