Senior Software Engineer, Data
Flex - Full Time
- Senior (5 to 8 years)
Candidates should have at least 8 years of experience designing and building scalable, distributed systems. Strong programming skills in Java, Scala, or C++ are required, with a focus on performance and reliability. A deep understanding of distributed transaction processing, concurrency control, and high-performance query engines is essential. Experience with open-source data lake formats such as Apache Iceberg, Parquet, or Delta and familiarity with cloud-native services on public cloud providers such as AWS, Azure, or GCP are also required. Candidates should also bring a passion for open-source software and community engagement in the data ecosystem, along with knowledge of data governance, security, and access control models in distributed data systems.
The Senior Software Engineer will design and implement scalable, distributed systems to support Iceberg DML/DDL transactions, schema evolution, partitioning, and time travel. They will architect and build systems that integrate Snowflake queries with external Iceberg catalogs, enabling interoperability across cloud providers. The role includes developing high-performance solutions for catalog federation, collaborating with the open-source team and Apache Iceberg community, and working on core data access control and governance features for Polaris. The engineer will also contribute to the managed Polaris service, build tooling for data lake table maintenance, and ensure efficient query performance.
Data management and analytics platform
Snowflake provides a platform called the AI Data Cloud that helps organizations manage and analyze their data. The platform lets users store and process large amounts of data efficiently, offering services such as data warehousing, data lakes, data engineering, data science, and data sharing. It unites data from different sources, enables secure sharing, and supports various types of analytics. What sets Snowflake apart from its competitors is its ability to operate across multiple public clouds, allowing users to access their data from anywhere. The company's goal is to help businesses use their data for better decision-making through a flexible, subscription-based service that scales with their needs.