Lead Site Reliability Engineer
Kraken, Full Time
Expert & Leadership (9+ years)
Candidates should have 6+ years of experience in Site Reliability Engineering, DevOps, or related fields, with a focus on building scalable, resilient, and automated cloud-based systems. They must have hands-on experience designing, deploying, and optimizing production-grade, business-critical systems in cloud environments, and expertise with at least one major public cloud provider (AWS, Google Cloud, or Azure). Strong proficiency is required in infrastructure-as-code tools like Terraform, CI/CD pipeline design, Linux/Unix systems, networking fundamentals, distributed system architectures, and scripting or software development in Python, Bash, or Go. Experience with containerization and orchestration technologies such as Docker and Kubernetes, with monitoring, logging, and observability tools, and with implementing security best practices is also necessary.
The Lead Site Reliability Engineer will design, build, and optimize a scalable, highly resilient cloud platform, with a focus on improving reliability, performance, and automation. Responsibilities include designing, deploying, and optimizing large-scale Aerospike cloud platform infrastructure, leading the development of automation and infrastructure-as-code solutions, and building and maintaining monitoring, alerting, and observability implementations. The role also involves leading incident response and conducting post-mortems, driving continuous improvement initiatives, designing and enforcing security best practices, collaborating with development teams, participating in a 24/7 on-call rotation, establishing documentation standards, leading capacity planning and performance optimization, and mentoring junior engineers.
High-performance NoSQL database for real-time applications
Aerospike builds high-performance, scalable databases for real-time applications, primarily serving large enterprises in finance, telecommunications, e-commerce, and advertising technology. Its main product is a NoSQL database that can process millions of transactions per second with low latency, making it suitable for applications that require quick data access. Aerospike offers on-premises, cloud-based, and hybrid deployment options, and supports container technologies such as Docker and Kubernetes for flexible deployment. Unlike many competitors, Aerospike focuses on providing advanced features such as cross-datacenter replication and strong consistency. The company's goal is to enable businesses to efficiently manage and access large volumes of data in real time.