Job Description: Cloud Site Reliability Engineer (SRE)
Position Overview
- Company: Promise
- Employment Type: Full-Time
- Salary: $149K - $195K
- Location Type: (Not specified)
Description:
Promise empowers utilities and government agencies to create flexible, affordable solutions for individuals struggling with debt. We’re looking for a Cloud Site Reliability Engineer (SRE) to build, operate, and optimize the infrastructure that powers our products. You’ll be responsible for ensuring high reliability, performance, and scalability of our cloud-based systems. The ideal candidate is self-sufficient, detail-oriented, and execution-driven, with a strong background in software development, site reliability engineering, and infrastructure-as-code (IaC). You’ll collaborate closely with product and engineering teams to improve system architecture, troubleshoot issues, and automate operational processes. This role is ideal for someone who thrives in a hard-working, fast-moving environment, enjoys solving complex technical challenges, and takes personal responsibility for ensuring that security outcomes are achieved and aligned with business goals.
Requirements
- 4+ years of experience in Linux system administration, managing large-scale production environments.
- Strong debugging skills, with experience in performance tuning and in working with observability tools, stack traces, and system logs.
- Experience with infrastructure-as-code (IaC) tools (e.g., Terraform).
- Scripting skills (e.g., Python, Bash).
- Experience with configuration management tools.
- Understanding of security best practices and compliance requirements.
Responsibilities
- Design, implement, and manage cloud infrastructure to ensure reliability, scalability, and security.
- Automate infrastructure and operations using Terraform, scripting, and configuration management tools.
- Develop strong relationships with engineering teams to define system reliability goals and best practices.
- Troubleshoot and resolve complex network and system issues.
- Monitor and optimize system performance, implementing best practices for high availability and disaster recovery.
- Formalize a security design review process and guide the Engineering team through it.
- Ensure the security and stability of Linux-based production systems.
- Help align our technology projects with compliance requirements, including state and federal regulations, while fostering an environment of innovation.
- Serve as a bridge between technical teams and non-technical stakeholders, translating security and compliance needs into actionable plans that support our broader business objectives.
Company Information
- Company Overview: Promise empowers utilities and government agencies to create flexible, affordable solutions for individuals struggling with debt. Our innovative approach to payment plans and relief distribution significantly improves enrollment and recovery rates, helping individuals clear debts faster and reducing delinquencies for our partners. We treat people facing financial difficulties with respect and dignity, providing the tools and resources they need to thrive. Our team includes experts from companies like Palantir, Google, and Stripe, as well as esteemed government leaders.
- Funding: Backed by over $50 million in funding from top investors such as 8VC, Kapor Capital, XYZ Ventures, and Howard Schultz.
- Recognition: Recognized as one of Fast Company’s “World’s Most Innovative Companies of 2022.”