Cloud Infrastructure Software Engineer
Clickhouse
Full Time
Senior (5 to 8 years)
Candidates should have over 5 years of hands-on infrastructure or DevOps engineering experience, preferably in fast-paced startup environments. Strong experience with AWS or GCP, Kubernetes in production, Docker, and Helm is required, as is proficiency with Terraform, shell scripting (e.g., Bash), and Python for automation. Candidates must be comfortable reading and contributing to application code (Node.js, Python) and familiar with security best practices and compliance standards (SOC 2, HIPAA, etc.) in cloud-native environments. The ideal candidate thrives in high-ownership environments where priorities shift quickly, balances speed with long-term reliability, and has experience working cross-functionally with developers, product teams, and customers. Experience in early- to mid-stage startups, especially those with AI/ML infrastructure or SaaS platforms, is preferred.
The Infrastructure Engineer will design, secure, and maintain the cloud infrastructure powering production SaaS and ML workloads across AWS and/or GCP. They will build and operate scalable, containerized applications using Kubernetes, Helm, and Docker, and develop and manage infrastructure-as-code solutions using Terraform, Bash, and Python. The role involves working directly with customers and internal teams to meet security, compliance, and reliability requirements (SOC 2, HIPAA, GDPR). They will also improve observability, reliability, and on-call processes, including SLOs, SLAs, and incident response. Key duties include automating CI/CD workflows with tools like GitHub Actions and Spacelift and contributing code (Python, Node.js) to product features and platform infrastructure. Identifying and acting on cost-optimization opportunities across the tech stack is also expected.
Platform for creating and deploying AI models
Roboflow offers a platform for engineers to create, train, and deploy machine learning models using their own images and videos. The platform features an auto-annotate API for efficient data labeling, along with tools for preprocessing and augmenting image data. Roboflow distinguishes itself from competitors by providing project management tools that enhance team collaboration on AI projects. The company's goal is to simplify the AI development process for a diverse range of clients, from individual engineers to large organizations.