Big Data Engineer
Full Time
Junior (1 to 2 years), Mid-level (3 to 4 years)
Candidates should have 2+ years of experience as a Data Engineer or Software Backend Engineer, with strong programming skills in Python, Scala, or Java. Experience with NoSQL wide-column stores like HBase, with distributed data systems such as Spark, Kafka, or Flink, and with writing complex SQL queries is required. Familiarity with workflow orchestration tools and data modeling techniques is also expected.
The Data Engineer will design, build, and maintain data pipelines, ETL/ELT workflows, and scalable microservices using Java, Python, and Spring Batch. Responsibilities include developing complex web scraping and real-time pipelines, implementing frontends for data workflows, and deploying services through CI/CD pipelines into AWS ECS/Fargate. The role also involves ensuring services are monitorable, debuggable, and reliable, architecting data storage models, building advanced data search capabilities, applying ML techniques, and documenting system architecture with diagrams.
Manages and secures open-source software usage
Sonatype helps organizations manage and secure their use of open-source software, which is software that anyone can inspect and modify. Its main product, the Nexus Platform, automates DevOps processes and governs the usage of open-source software. The platform supports practices that combine software development and IT operations to speed up the development lifecycle and ensure high-quality software delivery. Sonatype serves a variety of clients, including IT leaders and developers across different industries, such as healthcare. Unlike many competitors, Sonatype offers both free and paid versions of its products for managing software components. Its goal is to provide tools that improve software security and development efficiency, and it generates revenue through subscriptions to its advanced features.