Staff Big Data Engineer
Full Time
Expert & Leadership (9+ years)
Candidates should have at least 3 years of data engineering experience, including exposure to on-premises systems such as Spark, Hadoop, and HDFS. A strong understanding of engineering best practices, proficiency in Python or Java/Scala, and knowledge of SQL across various database dialects are essential. Familiarity with CI/CD processes, software containerization, and stream processing frameworks is also required. Desirable qualifications include experience with architectural design, technical ownership, data governance, data lineage, and data quality initiatives.
The Data Engineer will be responsible for designing and building scalable data pipelines using tools such as Airflow, Spark, and Kafka. They will implement monitoring and alerting systems for data quality, and support data governance and lineage initiatives. The role involves collaborating with peers to enhance the shared data platform and identifying improvements for system reliability, maintainability, and performance.
Operates Wikipedia and free knowledge projects
The Wikimedia Foundation operates Wikipedia and other free knowledge projects, aiming to create a world where everyone can freely access and share knowledge. It provides a platform for users to read, contribute, and share content, while also supporting the volunteer communities that help maintain these projects. The foundation is a nonprofit funded by donations from individuals and institutions. It focuses on making knowledge accessible to all without charge and advocates for policies that support free knowledge initiatives. Its goal is to empower individuals to contribute to and benefit from a collective pool of knowledge.