Site Reliability Engineer
Stitch Fix
Full Time
Mid-level (3 to 4 years)
The Senior Site Reliability Engineer should have 7+ years of experience in SRE or infrastructure roles improving production systems at scale. The role calls for deep MySQL experience, including schema design, performance tuning, and operational tooling; fluency in cloud-native technology (GCP a plus) and Terraform; proficiency in Go and Bash for scripting and systems programming; and skill in observability, incident response, and debugging distributed systems.
This role involves building and scaling infrastructure that supports billions of messages per day and real-time events; automating deployments, alerting, and incident response; and improving the on-call process through clear alerts, solid documentation, and faster resolution. It also includes tuning the performance of MySQL and other datastores, improving reliability across distributed systems, and collaborating across teams to debug, ship, and support systems in production. The engineer is expected to share knowledge and raise the bar by sharing progress publicly, and to use AI tools to prototype, move faster, and make better decisions.
Marketing automation for customer engagement
Customer.io is a marketing automation platform that helps businesses engage with their customers throughout the customer lifecycle. Companies can segment their audience based on real-time events and send personalized messages that strengthen customer connections and engagement. Unlike many competitors, Customer.io offers A/B testing and professional support services to optimize marketing strategies. The goal is to empower businesses to send data-driven messages that improve customer engagement and drive revenue.