Remote Site Reliability Engineer Jobs

Browse a wide range of remote Site Reliability Engineer positions available globally. New jobs added frequently.

United States
Remote

Senior Site Reliability Engineer

Flock Safety

Candidates must possess senior-level experience in a Site Reliability Engineering role, with a strong understanding of monitoring, troubleshooting, and disaster recovery. Extensive experience in writing production-quality code, preferably in Go, Python, or TypeScript, is required. Proficiency with infrastructure as code and configuration management tools such as Terraform is essential. Experience managing monitoring dashboards with tools like Grafana and Prometheus to create actionable alerts is…

  • Compensation: $171,000 - $190,000/year
  • Employment type: Full Time
  • Experience level: Senior (5 to 8 years), Expert & Leadership (9+ years)
United States
Remote

Senior Site Reliability Engineer

Red Cell Partners

Candidates should have proven Site Reliability Engineering and DevOps experience, with demonstrated experience in managing complex, large-scale production environments. Experience with cloud platforms such as GCP, AWS, or Azure is required, along with expertise in automation tools and frameworks, monitoring solutions like Prometheus, Grafana, and the ELK stack, and CI/CD pipelines for machine learning models.

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Senior (5 to 8 years)
Boston
Remote

Site Reliability Engineer

Hometap

The Site Reliability Engineer should have 3+ years of experience in Site Reliability Engineering (SRE), DevOps, or a similar role, 1+ years of hands-on experience with AWS services including ECS, EKS, CloudWatch, Lambda, CloudFront, S3, and DynamoDB, 1+ years of experience with observability and monitoring tools such as CloudWatch, Sentry, Grafana, or Prometheus, and basic proficiency in Terraform for infrastructure-as-code implementation. Strong troubleshooting and problem-solving skills, along…

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Junior (1 to 2 years)
United States
Remote

Principal Site Reliability Engineer

General Motors

Candidates must possess strong programming skills in at least one language such as Python, Go, or Java, and be familiar with multiple language ecosystems. A solid understanding of operating systems, networking, distributed systems, databases, and storage architectures is required, along with a deep comprehension of how code runs on underlying hardware, including operating systems and algorithms.

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Expert & Leadership (9+ years)
United States
Remote

Software Engineer, SRE (Staff/Senior Levels)

Kustomer

Candidates should possess a Bachelor’s degree in Computer Science or a related field, and have at least 3 years of experience in software engineering, with a preference for experience in Site Reliability Engineering. Strong experience with cloud infrastructure, particularly AWS, is required, along with a solid understanding of CI/CD processes and automation tools. A passion for AI and automation technologies is highly desired.

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Expert & Leadership (9+ years), Senior (5 to 8 years)
United States
Remote

Staff Software Engineer - SRE, Backend (Reliability Engineering)

Affirm

Candidates should have 7+ years of experience designing, developing, and launching backend systems at scale using languages like Python or Kotlin, 7+ years of experience in a Site Reliability or Production Engineering team, and a Bachelor’s degree in a related field or equivalent practical experience. They should demonstrate curiosity with empathy and strong opinions loosely held, and experience delivering major features, system components, or deprecating existing functionality through the defin…

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Senior (5 to 8 years)
United States
Remote

Site Reliability Engineer

Close

Candidates should have 5+ years of experience building modern infrastructure systems for Senior 1 & 2 level roles, and 8+ years for Staff level roles, with experience as the final point of escalation in the support of mission-critical production systems. Familiarity with AWS, Terraform, CircleCI, GitHub Actions, ArgoCD, Ansible, Elasticsearch, MongoDB, PostgreSQL, ClickHouse, Kubernetes, Loki, Tempo, Grafana, Mimir/Prometheus, and Argo Workflows is required.

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Junior (1 to 2 years)
United States
Remote

Staff Site Reliability Engineer

Aerospike

Candidates should possess 8+ years of experience in SRE, DevOps, or infrastructure engineering with significant time operating production systems at scale. They must have deep hands-on experience with at least one major public cloud (AWS, GCP, Azure) and working knowledge of others, with Azure experience being a plus. Production experience with Kubernetes, including operating clusters, Helm, operators, and supporting microservices, is required, along with strong proficiency in infrastructure-as-…

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Expert & Leadership (9+ years)
New York
Remote

Site Reliability Engineer

Superblocks

Candidates must have 3+ years of experience managing cloud-based production applications with deep knowledge of containers, VMs, caches, task queues, networking, and OS. They should have experience designing and deploying infrastructure in production at scale using containerized solutions like Docker, Kubernetes, ECS/EKS, or Firecracker. A strong product sense focused on great user experiences and strategic thinking to meet market and customer needs is also required. Experience building and oper…

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Mid-level (3 to 4 years)
San Francisco +7 more
Remote

Senior Site Reliability Engineer

Chainlink Labs

Candidates should possess at least 8 years of relevant professional experience, ideally with a background in DevOps, infrastructure, SRE, or platform teams. A strong DevOps mentality, experience building and maturing a GitOps environment, and proficiency in software development beyond typical infrastructure configurations are essential. Demonstrable skills in shell scripting and at least one higher-level programming language, excellent Linux understanding, and expertise in designing, deploying, …

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Senior (5 to 8 years)
New York
Remote

SRE Tech Lead Manager

Oura

Candidates should have over 7 years of backend development experience, with at least 2 years managing or leading infrastructure-focused teams. A passion for building inclusive, high-performing teams, excellent communication and decision-making skills, and experience designing and building data-intensive distributed systems in production are essential. Experience designing for scale and growth, particularly fault-tolerant and secure systems, along with strong experience running, monitoring, and d…

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Expert & Leadership (9+ years)
United States
Remote

Senior Site Reliability Engineer, Devices

Flock Safety

Candidates should possess strong coding skills in languages such as Python, R, JS, Java, or Groovy, and have a solid understanding of common algorithms. Experience with software development workflows including continuous integration and test automation, along with tools like Git, Jenkins, and GitHub Actions, is essential. Proficiency in SQL databases (e.g., PostgreSQL), NoSQL, and time-series databases (e.g., Prometheus, Datadog) is required, as is experience with volume data processing, data vi…

  • Compensation: $150,000 - $190,000/year
  • Employment type: Full Time
  • Experience level: Senior (5 to 8 years)
United States
Remote

Senior Site Reliability Engineer

Calendly

Candidates must have a strong understanding of the Linux operating system and possess strong technical knowledge of cloud infrastructure, particularly GCP, distributed systems, and reliability practices. Deep experience is required in designing, building, and running highly-available production infrastructure, along with strong Golang or Python development experience, especially writing APIs for cloud infrastructure management. Solid working knowledge of patterns and principles for designing and…

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Senior (5 to 8 years)
United States
Remote

Staff Systems Reliability Engineer

iRhythm Technologies

The Staff Systems Reliability Engineer V requires a minimum of 8 years of related experience with a Bachelor’s degree; or 6 years and a Master’s degree; or equivalent work experience. Candidates should possess expert-level knowledge of AWS services such as EC2, Lambda, VPC, IAM, RDS, ECS/EKS, and familiarity with regulatory requirements like FDA 21 CFR Part 11, HIPAA, ISO 13485, and EU MDR. Strong proficiency in Python and/or Go for automation and tooling, as well as experience with Helm, Argo C…

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Expert & Leadership (9+ years)
United States
Remote

Senior Engineering Manager, Site Reliability

Ditto

Candidates should have experience leading and scaling a globally distributed SRE organization, including managers and individual contributors. Proven ability to develop engineering leaders and senior talent through coaching is essential. Experience in driving adoption of SRE best practices, establishing incident management practices, and leading the architecture and execution of observability systems is required. Familiarity with defining and implementing SLIs, SLOs, and SLAs, along with experie…

  • Compensation: Salary not specified
  • Employment type: Full Time
  • Experience level: Expert & Leadership (9+ years)
