Site Reliability Engineer
Stitch Fix | Full Time
Mid-level (3 to 4 years)
Candidates should have professional experience that blends Linux systems administration and software development, with a demonstrated ability to write clean, maintainable code, particularly in Python and bash, in structured, team-oriented development environments that use code review and source control. Experience deploying and monitoring distributed systems, such as microservices or client/server architectures, is required, along with hands-on experience designing and managing petabyte-scale storage systems such as Lustre, BeeGFS, Ceph, or ZFS. Familiarity with containerization technologies such as Docker and Singularity, and with infrastructure-as-code tools such as Terraform, Ansible, or CDK, is also necessary.
As a Site Reliability Engineer, you will architect and scale Boom’s on-prem and cloud-based HPC infrastructure, supporting GPU, CPU, and hybrid workflows. You will optimize job scheduling and distributed workload management using tools like SLURM, AWS Batch, and Kubernetes, and engineer storage solutions that balance IOPS, throughput, and cost across various file systems. You will embed with simulation and data teams to identify and eliminate bottlenecks, and improve observability across internal applications. You will automate deployments, upgrades, health checks, and recovery processes, and own infrastructure reliability across cloud (AWS) and on-prem environments. You will also collaborate with aerospace engineers and IT partners to reduce failure modes, champion SRE best practices, and mentor teammates while influencing broader software lifecycle strategy.
Develops and manufactures supersonic aircraft
Boom Supersonic develops and manufactures supersonic commercial aircraft aimed at reducing flight times for long-distance travel. Its main product, the Overture aircraft, is designed to run on 100% sustainable aviation fuel, supporting efforts to lower carbon emissions in the aviation industry. Unlike other aviation companies, Boom Supersonic focuses specifically on supersonic travel, and it has secured partnerships with major airlines such as American Airlines and United Airlines, which indicates strong market interest. The company's goal is to transform air travel by making it faster and more efficient while promoting sustainability through advanced aerospace technology.