Staff System Reliability Engineer V
Employment Type: Full-time
Position Overview
iRhythm is seeking a highly experienced and strategic Staff System Reliability Engineer V to lead the design, scalability, and resilience of our cloud infrastructure. This role is ideal for someone with deep expertise in AWS, infrastructure automation, and observability who thrives in complex, high-availability environments. As a senior technical leader, you’ll work closely with engineering and security teams to optimize performance, improve deployment pipelines, and uphold service reliability across mission-critical systems.
Responsibilities
- Design and implement scalable, fault-tolerant AWS-based infrastructure using Terraform and/or CloudFormation for regulated workloads (e.g., HIPAA, FDA 21 CFR Part 11, EU MDR).
- Develop and maintain CI/CD pipelines using tools like GitLab CI, Argo CD, or similar.
- Write automation tools and scripts in Python and/or Go to support operations, monitoring, and self-healing systems.
- Lead incident response efforts, root cause analysis, and postmortem documentation for system failures.
- Author GitLab pipelines.
- Support Kubernetes (EKS) cluster management.
- Migrate applications from EC2 instances behind ELB/ALB to Kubernetes, using Helm for configuration management.
- Define and monitor SLOs, SLAs, and error budgets across key services.
- Implement and manage observability tools (e.g., Prometheus, Grafana, CloudWatch, OpenTelemetry).
- Collaborate with software engineers to ensure systems are designed for reliability and security from the ground up.
- Harden system security by implementing least-privilege IAM, automated patching, and vulnerability management.
- Evaluate and onboard new technologies to improve infrastructure efficiency and resilience.
- Mentor junior SREs and promote best practices in reliability engineering across the organization.
Requirements
- Experience: Minimum of 8 years of related experience with a Bachelor's degree; or 6 years and a Master's degree; or equivalent work experience.
- AWS Expertise: Expert-level knowledge of AWS services (EC2, Lambda, VPC, IAM, RDS, ECS/EKS, etc.).
- CI/CD & GitOps: Experience with Helm, Argo CD, and GitLab, including the ability to abstract complexity into templated pipeline archetypes for similar development projects. Deep understanding of infrastructure-as-code and GitOps workflows.
- Automation: Strong proficiency in Python and/or Go for automation and tooling.
- Observability: Experience managing observability and alerting systems at scale.
- Systems & Architecture: Strong grasp of Linux systems, networking, and distributed architecture principles.
- Regulatory Familiarity: Familiarity with regulatory requirements such as FDA 21 CFR Part 11, HIPAA, ISO 13485, and EU MDR as they relate to infrastructure and DevOps.
- Communication: Strong written and verbal communication skills, including documentation and incident reporting.
Company Information
At iRhythm, you’ll have the opportunity to grow your skills and your career while impacting the lives of people around the world. iRhythm is shaping a future where everyone, everywhere can access the best possible cardiac health solutions. Every day, we collaborate, create, and constantly reimagine what’s possible. We think big and move fast, driven by our commitment to put patients first and improve lives. We need builders like you: curious and innovative problem solvers looking for the chance to meaningfully shape the future of cardiac health, our company, and your career.
What’s In It for You
- Competitive Compensation: Includes base salary, annual performance bonus, and stock/equity opportunities.
- Outstanding Benefits: Comprehensive medical, dental, vision, and wellness programs.
- Generous Paid Time Off: Vacation, holidays, and sick leave to support work/life balance.
- Flexible Work Options: Hybrid and remote arrangements, depending on your location.
- Financial Wellness: 401(k) with company match and financial wellness resources.
Work Environment / Other Requirements
- Occasional travel to the office if located in the Bay Area.