Staff Site Reliability Engineer at Aerospike

Australia

Compensation: Not Specified
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Technology, Database

Requirements

  • 8+ years of experience in SRE, DevOps, or infrastructure engineering, including significant time operating production systems at scale
  • Deep hands-on experience with at least one major public cloud (AWS, GCP, Azure), and working knowledge of the others; Azure experience is a plus
  • Production experience with Kubernetes, including operating clusters, Helm, operators, and supporting microservices in real-world environments
  • Strong proficiency in infrastructure-as-code tools such as Terraform and CI/CD automation platforms
  • Expertise in observability tools and practices (Datadog, Prometheus, Grafana, ELK, etc.) and using them to define SLIs and SLOs; Datadog experience is a plus
  • Programming and scripting ability in one or more languages (Python, Go, Bash, etc.)
  • Experience with large-scale incident response and post-incident review practices
  • Proven ability to mentor other engineers and influence technical strategy across multiple teams
  • Strong communication skills to articulate complex concepts to technical and non-technical stakeholders

Responsibilities

  • Provide technical leadership across multiple systems and environments, proactively identifying risks, shaping architecture decisions, and improving reliability and performance at scale
  • Lead key infrastructure efforts, including Kubernetes platform expansion (AKS, AKO) and the application of SRE principles to legacy systems and new cloud offerings
  • Define, measure, and enforce reliability standards through SLIs/SLOs, observability tooling, and incident response frameworks
  • Mentor and guide other SREs by leading design sessions, conducting technical deep dives, and reviewing code, configurations, and infrastructure decisions
  • Partner with product, engineering, and cloud teams to align reliability goals with delivery objectives
  • Lead root cause analyses and implement systemic fixes for issues spanning multiple platforms or services
  • Drive automation-first approaches using IaC, CI/CD pipelines, and scripting to reduce toil and increase deployment confidence
  • Influence cross-functional roadmaps, identifying areas for innovation, technical debt reduction, and long-term scalability
  • Participate in the global on-call rotation, bringing senior-level calm and clarity during incidents and escalations

Skills

Key technologies and capabilities for this role

Kubernetes, AKS, Aerospike Kubernetes Operator, AWS, GCP, SRE, multi-cloud, hybrid cloud

Questions & Answers

Common questions about this position

What experience level is required for the Staff Site Reliability Engineer role?

The role requires 8+ years of experience in SRE, DevOps, or infrastructure engineering, including significant time operating production systems at scale.

What are the key technical skills needed for this position?

Key skills include experience with Kubernetes platforms like AKS and the Aerospike Kubernetes Operator, AWS and GCP services, SLIs/SLOs, observability tooling, IaC, CI/CD pipelines, and automation scripting.

Is this a remote position or does it require office work?

This information is not specified in the job description.

What is the compensation or salary for this role?

This information is not specified in the job description.

What does the company culture emphasize for SREs?

The culture emphasizes technical leadership, mentoring others, fostering ownership, resilience, continuous improvement, and applying modern SRE practices across teams and platforms.

Aerospike

High-performance NoSQL database for real-time applications

About Aerospike

Aerospike builds high-performance, scalable databases for real-time applications, primarily serving large enterprises in finance, telecommunications, e-commerce, and advertising technology. Its main product is a NoSQL database that can process millions of transactions per second with low latency, making it suitable for applications that require fast data access. Aerospike offers on-premises, cloud-based, and hybrid deployment options, and supports container tooling such as Docker and Kubernetes for flexible deployment. Unlike many competitors, Aerospike focuses on advanced features such as cross-datacenter replication and strong consistency. The company's goal is to enable businesses to manage and access large volumes of data efficiently in real time.

Headquarters: Mountain View, California
Year Founded: 2009
Total Funding: $195.5M
Company Stage: Late VC
Industries: Data & Analytics, Enterprise Software
Employees: 201-500

Benefits

Health Insurance
Paid Vacation
Professional Development Budget
Mental Health Support

Risks

Emerging NoSQL providers offering similar solutions at lower costs threaten Aerospike's market share.
Rapid AI evolution may strain Aerospike's resources to innovate continuously.
Cloud-native database trends challenge Aerospike's traditional deployment models.

Differentiation

Aerospike offers sub-millisecond latency for real-time data processing applications.
The company supports hybrid cloud environments, enhancing deployment flexibility for enterprises.
Aerospike's NoSQL database handles millions of transactions per second efficiently.

Upsides

Increased demand for real-time data processing boosts Aerospike's adoption in the financial sector.
The rise of edge computing positions Aerospike well for IoT and edge applications.
Expansion in digital payment systems offers growth opportunities in emerging markets.
