Senior Site Reliability Engineer
Chainlink Labs
Full Time
Senior (5 to 8 years)
Candidates must have 8+ years of experience in Site Reliability Engineering, DevOps, or a related field, with a focus on architecting scalable, resilient, and automated enterprise-scale systems. Experience leading complex infrastructure projects that drive measurable improvements in system reliability and performance is required. Deep knowledge of multiple public cloud providers (AWS, Google Cloud, Azure), including advanced cloud-native services and architectures, is essential. Candidates also need advanced proficiency in building automated, reproducible infrastructure at enterprise scale, and extensive experience designing and implementing CI/CD pipelines for automated software delivery and infrastructure updates at scale. A deep understanding of Linux/Unix systems, advanced networking concepts, and distributed system architectures is required, as is strong proficiency in scripting and software development in Python, Bash, Go, or similar languages for building automation, tooling, and infrastructure solutions.
The Senior Site Reliability Engineer will architect, build, and optimize enterprise-scale, highly resilient cloud platform infrastructure and services. They will establish reliability, performance, and automation standards to ensure seamless delivery and operation across the cloud platform ecosystem. Responsibilities include driving infrastructure initiatives across multiple teams, implementing organization-wide monitoring and observability practices, and leading strategic improvement initiatives to enhance system efficiency, scalability, and overall platform stability. The role also involves leading complex incident response and root cause analysis, serving as an escalation point for critical production incidents, establishing security best practices and documentation standards, leading capacity planning and performance optimization efforts, collaborating with development teams, and mentoring engineers across teams.
High-performance NoSQL database for real-time applications
Aerospike builds high-performance, scalable databases for real-time applications, primarily serving large enterprises in finance, telecommunications, e-commerce, and advertising technology. Its main product is a NoSQL database that can process millions of transactions per second with low latency, making it suitable for applications that require quick data access. Aerospike offers various deployment options, including on-premises, cloud-based, and hybrid environments, and supports container tooling such as Docker and Kubernetes for flexible deployment. Unlike many competitors, Aerospike focuses on providing advanced features such as cross-datacenter replication and strong consistency. The company's goal is to enable businesses to efficiently manage and access large volumes of data in real time.