Site Reliability Engineer
Stitch Fix, Full Time
Mid-level (3 to 4 years)
Common questions about this position
The role requires 7+ years of experience in software engineering with a strong focus on production systems and distributed architectures.
Skills that strengthen a candidacy include experience with distributed systems at scale; hands-on work with Kafka/Redpanda, PostgreSQL or other SQL databases, MongoDB or other NoSQL databases, and ClickHouse or other OLAP databases; a deep understanding of release automation, CI/CD, and code lifecycle management; and familiarity with gRPC.
Nominal has offices in Los Angeles, Austin, and New York City. The job description does not specify whether the role is remote or requires office presence.
The team includes alumni from SpaceX, Meta, Palantir, Anduril, Lockheed Martin, and NASA, united by a mission to accelerate hardware innovation through faster, smarter testing. The culture emphasizes continuous improvement, with a focus on embedding learnings from incidents into tools, practices, and company-wide processes.
Strong candidates thrive in high-leverage roles that improve how teams build, ship, and fix software; have led incident response and built systems for continuous improvement; and are excited by complexity and motivated to make systems safer and teams faster.
Software tools for engineering hardware systems
Nominal.io provides software tools designed specifically for engineering teams working with complex hardware systems. Their platform allows these teams to test and deploy hardware systems significantly faster than traditional methods, making it particularly beneficial for industries such as aerospace, defense, energy, and telecommunications, where hardware performance is critical. The platform consolidates data from various sources, enabling engineers to monitor and analyze their systems effectively in a secure environment. Unlike many competitors, Nominal.io focuses on a niche market with high demands for reliability, offering a software-as-a-service (SaaS) model that ensures clients have continuous access to the latest features. The company's goal is to enhance the resilience and performance of hardware systems, positioning itself as a key partner for engineering teams looking to improve their deployment processes.