Candidates must have 3+ years of experience developing data pipelines on cloud-managed Spark clusters, fluency in Spark and in Python or Java, previous experience building tools and libraries to automate and streamline data processing workflows, proficiency with SQL/SparkSQL, and hands-on experience working with a Data Lakehouse. Good verbal and written communication skills and proven experience delivering in an Agile environment are also required. Experience with DevOps pipelines, orchestration tools such as Airflow, AWS services for data processing, and the Life Sciences sector is considered nice to have.
The Data Engineer will be responsible for the OpenData data processing workflows in the US, building and maintaining data processing tools, pipelines, and reports, and ensuring data quality in the reference data. They will also develop algorithms to build complex data relationships, build analytical data structures to support reporting, build and maintain Data Quality processes, and collaborate with the Product team to adapt reference data to changing market demands.
Quality and regulatory software solutions provider
Veeva Systems offers software solutions for quality, regulatory, and advertising claims management, focusing on consumer products and chemical companies. Its cloud-based platform provides visibility and traceability throughout the product journey, helping clients comply with regulations and accelerate time-to-market. Unlike competitors, Veeva has specialized expertise in both the Life Sciences and Chemical sectors, allowing it to address industry-specific challenges effectively. The company's goal is to help clients efficiently bring safe and compliant products to market.