Ditto

Staff Site Reliability Engineer, US

United States

Compensation: Not Specified
Experience Level: Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Biotechnology, Software, Data Management, Edge Computing

Staff Site Reliability Engineer

About Ditto:

Ditto is redefining how data moves at the edge. Our mission is to make it seamless for developers to build resilient, real-time applications, regardless of network conditions. Whether you're in a stadium, airplane, or remote military base, Ditto's peer-to-peer sync engine ensures devices stay connected and data stays consistent, even without internet. With more than $145 million in funding and trusted by organizations like Chick-fil-A, Delta Air Lines, and the U.S. military, Ditto powers mission-critical experiences across aviation, retail, travel, hospitality, defense, and more. As a globally distributed, fast-growing startup, we’re committed to building a diverse and inclusive team that reflects the wide range of perspectives needed to solve the world’s hardest connectivity problems.

About the Position:

Ditto is at an inflection point. As we scale to meet the demands of our enterprise customers, we need experienced Site Reliability Engineers to ensure our infrastructure delivers enterprise-grade reliability. This is a unique opportunity to join a specialized team focused on observability, system reliability, and operational excellence for our cutting-edge edge-to-cloud database technology.

As a Staff Site Reliability Engineer, you will play a crucial role in ensuring the reliability, performance, and scalability of Ditto's cloud infrastructure. You'll collaborate with product engineering teams to improve system resilience, lead and develop incident management processes, and build observability solutions for our unique distributed architecture.

Responsibilities:

  • Develop and maintain observability solutions using platforms like Datadog, Prometheus, and Grafana.
  • Take a leading role in incident management, including coordinating response efforts, troubleshooting issues, and identifying follow-up actions.
  • Partner with product engineering teams to architect reliable systems, recover from incidents, and learn from mistakes.
  • Work with teams to implement and maintain SLOs, monitoring, and alerting strategies that ensure reliability at scale.
  • Design and implement automation and support tooling to improve system resilience, maintain operational safety, and reduce operational overhead.
  • Lead the development and maintenance of runbooks, alert definitions, and incident response procedures.
  • Participate in on-call rotations to provide 24/7 support for critical production systems.

Requirements:

  • 8+ years of experience in Site Reliability Engineering or similar DevOps roles focused on system reliability and incident management.
  • 6+ years of hands-on experience architecting applications for Kubernetes and managing Kubernetes infrastructure.
  • Strong experience with modern monitoring stacks including Prometheus, Grafana, and Datadog.
  • Experience in at least one systems programming language, such as Go, Rust, C, or Java.
  • Expertise with Infrastructure as Code tools, like Terraform and Helm.
  • Expertise with at least one major cloud service provider (AWS, GCP, Azure).
  • Strong communication skills, with the ability to lead incident response and effectively collaborate across teams.
  • Willingness to join, and experience with, on-call rotations and emergency response procedures.
  • A high degree of agency and a bias towards action: you identify problems and work autonomously to solve them.
  • Excellent problem-solving skills and a methodical approach to troubleshooting complex issues.

Nice to Have:

  • Experience building multi-tenant, multi-cloud SaaS/DBaaS Platforms.
  • Knowledge of edge computing or mesh networking.
  • Experience implementing advanced observability practices (tracing, profiling) in distributed systems.
  • Experience working with globally distributed teams.
  • Proven experience in project management.

Benefits:

We offer competitive salaries and meaningful equity. We believe everyone on the team should have a stake in what we’re building. Benefits vary by region to make sure you're covered in the ways that matter most.

Skills

Site Reliability Engineering
Observability
System Reliability
Operational Excellence
Cloud Infrastructure
Incident Management
Datadog
Prometheus
Grafana
System Resilience
Troubleshooting
Distributed Architecture

Ditto

Simplifies multi-platform app development and synchronization

About Ditto

Ditto.live simplifies the development of native applications for various platforms, including iOS, macOS, Android, and web. Its main product, the Edge Sync Platform, addresses the challenge of data synchronization by allowing developers to manage data that is distributed across multiple devices and cloud infrastructures. This platform enables developers to write their code once and deploy it across different platforms, which saves time and reduces effort in the app development process. Unlike many competitors, Ditto focuses on providing a seamless experience for developers by offering features like peer-to-peer authentication and offline syncing. The company's goal is to enhance the efficiency of app development and improve user experiences by enabling the creation of interconnected applications.

Headquarters: San Francisco, California
Year Founded: 2018
Total Funding: $52.5M
Company Stage: Series A
Industries: Data & Analytics, Consumer Software, Enterprise Software
Employees: 51-200

Benefits

Health Insurance
Dental Insurance
Vision Insurance
Life Insurance
Disability Insurance
Flexible Spending Account
Unlimited Paid Time Off
401(k) Retirement Plan
Stock Options

Risks

Emerging startups may dilute Ditto's market share with similar solutions.
Rapid app framework evolution could outpace Ditto's integration capabilities.
Economic downturns may challenge Ditto's subscription-based revenue model.

Differentiation

Ditto offers real-time data sync without internet, unlike many competitors.
Their Edge Sync Platform supports both iOS and Android, reducing development time.
Ditto's peer-to-peer authentication enhances data privacy and security.

Upsides

Growing demand for edge computing boosts Ditto's market potential.
Offline-first app development trend aligns with Ditto's core capabilities.
5G expansion enhances Ditto's real-time data synchronization benefits.
