Gorgias

Senior Site Reliability Engineer

Berlin, Berlin, Germany

Compensation: Not Specified
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Yes
Industries: Ecommerce, AI Platforms, Cloud Infrastructure, Software as a Service

Position Overview

  • Location Type: Remote
  • Job Type: Full-Time
  • Salary: Not specified
  • Relocation: Required to Paris, Lisbon, or Belgrade (Relocation and visa provided)

Gorgias is a conversational AI platform for e-commerce, designed to boost sales and streamline support inquiries. It is utilized by over 15,000 e-commerce brands, supporting businesses from independent shops to globally recognized brands. Gorgias is built for Shopify and integrates advanced e-commerce capabilities, enabling personalized customer interactions through its conversational AI, which understands brand specifics, tools, policies, and customer data. This allows for efficient handling of tasks like order modifications, returns, and product recommendations. Gorgias aims to make every customer interaction personal, transform support into sales opportunities, and shape success through conversations.

About The SRE Team

The Site Reliability Engineering (SRE) team at Gorgias is responsible for maintaining the core infrastructure and services that power the product. The team works with high-throughput systems and large-scale data stores, handling billions of queries daily with sub-millisecond response times. They also design and manage the software delivery stack, including features like metrics-based canary rollout strategies for all internal development teams. The team currently consists of 9 Senior and Staff SREs globally, with plans to expand to 12. The team prioritizes scalable methods to maximize impact across the organization.

Notable Team Achievements:

  • Reduced PostgreSQL VACUUM time by 5x by partitioning multi-TB tables. The work involved in-depth problem analysis, strategy development, and query analysis, and used Debezium and Kafka for live copying. It was completed with less than 20 minutes of maintenance and no data loss.
  • Split PostgreSQL connection proxy into multiple pools to guarantee quotas per service, isolating database-intensive sub-systems and preventing widespread incidents due to connection starvation. This involved deep backend investigation, contributing to the fix, and guiding teams through migration.
  • Collaborated with all product-engineering teams to achieve SOC2 certification, managed a HackerOne program, refactored incident management using Rootly for improved visibility and resolution times, and enhanced overall security posture.
  • Continuously upgrades self-hosted PostgreSQL and RabbitMQ, along with other critical infrastructure components, while ensuring minimal downtime and high accuracy.

What You Will Do

  • Manage multi-TB PostgreSQL clusters in the public cloud, including parameter optimization, storage configuration, and data structure refinement.
  • Operate RabbitMQ and Redis at high volumes, handling tens of thousands of operations per second.
  • Manage over 10 full-featured GKE clusters globally, supporting more than 10,000 tenants.
  • Adopt and implement new technologies within the stack, including Kafka, Debezium, and Apache Flink.
  • Facilitate large-scale rollout strategies using GitLab CI and ArgoCD.
  • Implement and promote best practices across Product-Engineering teams for Kubernetes/Helm/Operators, SLIs/SLOs, Incident Management, Observability, Security, and Disaster Recovery.
  • Automate complex infrastructure components for the global footprint using Infrastructure as Code (IaC) with Terraform and robust scripting in Python/Golang.

What You Should Have

  • Experience with cloud-native web systems operating at scale.
  • A Bachelor's degree in Computer Science or equivalent work experience.
  • 5+ years of experience as a Site Reliability Engineer or in a similar role, with a strong focus on maintaining high-performance, scalable, and reliable high-throughput web systems.
  • Proficiency in using Kubernetes for container orchestration.

Skills

Site Reliability Engineering
Systems Reliability
Scalability
Performance Optimization
Data Storage
PostgreSQL
Partitioning
Debezium
Kafka
Metrics-based Deployment
Canary Rollouts
High Throughput Systems
TB-scale Data
Monitoring
Infrastructure Automation

Gorgias

AI-powered customer service for e-commerce

About Gorgias

Gorgias provides an AI-powered customer service solution tailored for Shopify stores in the e-commerce sector. Its main service automates the customer support process by sorting, prioritizing, and tagging customer inquiries, known as "tickets," as they arrive. This automation minimizes the need for manual ticket management, allowing businesses to save time and resources. Gorgias' AI can also deliver instant resolutions to customer issues, enhancing the efficiency of support operations. Additionally, the platform enables businesses to create detailed customer profiles that include order history, loyalty status, and reviews, which helps support teams offer personalized service. Gorgias operates on a subscription model, generating consistent revenue while helping businesses improve their customer service and reduce support workloads. The company caters to a diverse range of clients, from small businesses to large corporations, all within the competitive e-commerce landscape.

Headquarters: San Francisco, California
Year Founded: 2015
Total Funding: $98.6M
Company Stage: Series C
Industries: Consumer Software, AI & Machine Learning
Employees: 201-500

Benefits

Competitive salary
Health coverage
Generous equity package
Company offsites
Latest laptop
16-week parental leave
Catered meals
4 weeks of vacation
Retirement benefits
Fully stocked kitchen

Risks

Increased competition from Zendesk and Freshdesk may impact Gorgias's market share.
Rapid expansion to 300 employees across 16 countries may cause operational inefficiencies.
Reliance on Shopify integration poses risks if Shopify changes policies or competes.

Differentiation

Gorgias offers AI-powered customer service tailored for ecommerce businesses.
The platform integrates seamlessly with Shopify, enhancing customer support efficiency.
Gorgias provides omnichannel support, including email, voice, SMS, and social media.

Upsides

Recent $29M funding will expand AI tools, automating 60% of customer support.
Gorgias serves over 15,000 ecommerce brands, including Steve Madden and Glossier.
The trend of hyper-personalization boosts demand for Gorgias's AI-driven solutions.
