Senior Cloud Data Infrastructure Engineer
Clickhouse, Full Time
Senior (5 to 8 years), Expert & Leadership (9+ years)
Candidates must have 3+ years of experience developing data pipelines using cloud-managed Spark clusters like AWS EMR or Databricks. Fluency in Python or Java and Spark, along with previous experience building tools and libraries to automate data processing workflows, is required. Proficiency with SQL/SparkSQL, hands-on experience with a Data Lakehouse, and proven experience working and delivering in an Agile environment are also necessary. Nice-to-have qualifications include experience running data workflows through DevOps pipelines, developing data pipelines with orchestration tools like Airflow, experience with AWS services for data processing, and previous experience in the Life Sciences sector.
The Data Engineer will be responsible for the OpenData data processing workflows in the US, building and maintaining data processing tools, pipelines, and reports to ensure the quality of reference data. They will develop algorithms to build complex data relationships, design analytical data structures to support reporting, and build and maintain Data Quality processes. The role also involves collaborating with the Product team to adapt reference data to changing market demands.
Quality and regulatory software solutions provider
Veeva Systems offers software solutions for quality, regulatory, and advertising claims management, focusing on consumer products and chemical companies. Their cloud-based platform provides visibility and traceability throughout the product journey, ensuring compliance with regulations and accelerating time-to-market. Unlike competitors, Veeva has specialized expertise in both the Life Sciences and Chemical sectors, allowing them to effectively address industry-specific challenges. The company's goal is to help clients efficiently bring safe and compliant products to market.