Staff Software Engineer
Upkeep, Full Time
Mid-level (3 to 4 years), Senior (5 to 8 years)
Candidates must have at least 12 years of professional software development experience, including significant experience leading large-scale, cross-team projects. Expertise is required in Java, Node.js, TypeScript, RESTful APIs, microservices architecture, AWS cloud services (ECS/EKS, Lambda, DynamoDB, API Gateway), and DevOps methodologies, including infrastructure-as-code (Terraform/CloudFormation), CI/CD tooling (GitHub Actions/Jenkins), observability and monitoring (Datadog/Prometheus), and containerization/orchestration (Docker/Kubernetes). Experience with large systems or event-driven platforms at enterprise scale is essential, as is a proven track record of delivering strategic cross-domain technical initiatives and influencing engineering direction. Candidates must also be willing to participate in on-call rotations for mission-critical systems. Preferred qualifications include deep expertise in complex workflow automation systems, experience defining system-level SLOs and performance metrics, and industry thought leadership.
The Staff Software Engineer V will define and lead the technical vision, architecture, and strategic direction for PagerDuty Workflow Automation, driving cross-domain initiatives and aligning them with company goals. They will architect and deliver complex technical solutions across multiple engineering teams, ensuring scalability, reliability, maintainability, and performance. Responsibilities include collaborating with leadership and stakeholders to identify critical technology investments, acting as a technical leader and subject matter expert in SaaS development, workflow automation, and cloud-native architectures, and mentoring senior engineers. The role involves owning the end-to-end lifecycle of technology initiatives, from inception to operational excellence, and driving continuous improvement in engineering practices, processes, and tools to enhance productivity, system reliability, and customer experience.
Incident management and response platform
PagerDuty specializes in incident management and response, providing a platform that helps organizations quickly address IT issues to minimize operational disruptions. The platform integrates with various monitoring tools to detect incidents in real time and alerts the right personnel so they can act swiftly. This reduces downtime and helps maintain service quality across sectors such as technology, finance, healthcare, and retail. PagerDuty operates on a subscription-based model with pricing tiers based on user count and feature level, which provides a steady revenue stream. The company also offers premium support and professional services. Overall, PagerDuty aims to help organizations efficiently manage and resolve IT incidents, ensuring the reliability of their digital services.