Upwork

Principal Site Reliability Engineer

Remote

Compensation: Not Specified
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Information Technology & Services

Job Description

Company: Upwork

Position Overview: Upwork is the world’s work marketplace, serving businesses from startups to over 30% of the Fortune 100. This role is part of Upwork’s Hybrid Workforce Solutions (HWS) Team, a global group supporting Upwork’s business operations. The position is full-time (about 40 hours per week, Monday through Friday) and includes a production on-call rotation, roughly once every 2-3 weeks. The role involves contributing to the continuous improvement of Upwork’s environment and working on a major revenue-producing website with millions of users.

Work/Project Scope:

  • Serve as a technical leader in modern SRE practices with a focus on zero-trust infrastructure, platform observability, and cloud-native scalability.
  • Guide the architectural evolution of reliability systems, including multi-cluster Kubernetes environments, GitOps workflows, and service mesh integration.
  • Champion SLO-driven engineering across teams and establish frameworks for defining, tracking, and enforcing reliability standards.
  • Partner with platform and security teams to enable service-to-service authentication, policy enforcement, and resilient control planes.
  • Develop AI-assisted tools and workflows (e.g., for incident triage, RCA generation, auto-remediation) to reduce operational burden and accelerate resolution.
  • Define and maintain end-to-end observability strategies including distributed tracing, metrics pipelines, and log enrichment.
  • Drive infrastructure automation efforts using IaC best practices, with an emphasis on policy-as-code, workload identity, and platform governance.
  • Lead post-incident reviews and reliability audits to surface systemic gaps and drive continuous improvement.
  • Mentor engineers across infrastructure and application teams on designing and operating reliable, scalable systems.

Requirements (Must Haves):

  • Experience: 10+ years in SRE, DevOps, or production engineering roles, including experience operating large-scale distributed systems in production.
  • Kubernetes Expertise: Deep expertise in Kubernetes operations, including multi-cluster orchestration, service mesh (Istio or equivalent), and workload policy management (e.g., OPA, Kyverno).
  • GitOps: Proven experience building and maintaining GitOps pipelines using tools like ArgoCD or Flux.
  • Observability: Strong fluency in observability tooling (e.g., Prometheus, OpenTelemetry, Grafana, or Datadog), with a focus on SLO-based alerting and incident detection.
  • Automation & AI: Familiarity with reliability-as-code practices and automation using scripting languages (Python, Go, or Bash) and AI-enhanced workflows (e.g., Cursor, incident bots, PR-generating agents).
  • Zero Trust: Experience designing and enforcing zero trust service-to-service authentication, workload identity, and mTLS policies.
  • Incident Management: Track record of leading incident review programs, standardizing postmortems, and driving systemic reliability improvements.
  • Collaboration: Ability to work cross-functionally with platform, security, and developer enablement teams to embed resilience across the SDLC.

Company Information: Upwork is the world’s work marketplace. We serve everyone from one-person startups to over 30% of the Fortune 100 with a powerful, trust-driven platform that enables companies and talent to work together in new ways that unlock their potential. Last year, more than $3.8 billion of work was done through Upwork by skilled professionals who are gaining more control by finding work they are passionate about and innovating their careers.

Upwork is proudly committed to fostering a diverse and inclusive workforce. We never discriminate based on race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical condition), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.

Application Instructions: To learn more about how Upwork processes [Information truncated in the original description].

Skills

SRE practices
zero-trust infrastructure
platform observability
cloud-native scalability
multi-cluster Kubernetes
GitOps workflows
service mesh
SLO-driven engineering
reliability standards
service-to-service authentication
policy enforcement
resilient control planes
AI-assisted incident triage
RCA generation
auto-remediation
distributed tracing
metrics pipelines

Upwork

Online platform connecting freelancers and clients

About Upwork

Upwork connects freelancers with clients looking for various services in the gig economy, which focuses on short-term contracts instead of permanent jobs. The platform allows freelancers to create profiles that showcase their skills, while clients can post job listings for specific projects. Freelancers bid on these projects, and clients choose the best candidates based on proposals and reviews. Upwork earns revenue through service fees charged to freelancers based on their earnings, with a tiered structure that rewards long-term client relationships. The platform also offers premium memberships and additional services for enhanced visibility and access to job listings. Upwork provides tools for time tracking, invoicing, and project management, making it easier for both freelancers and clients to manage their work and payments. The goal of Upwork is to facilitate successful project completion by bridging the gap between freelancers and clients.

Headquarters: San Francisco, California
Year Founded: 2015
Total Funding: $143.8M
Company Stage: IPO
Industries: Consulting, Enterprise Software
Employees: 10,001+

Benefits

Health Insurance
Unlimited Paid Time Off
401(k) Retirement Plan
401(k) Company Match
Parental Leave
Employee Stock Purchase Plan

Risks

Increased competition from Fiverr and Toptal threatens Upwork's market share.
The new Fiverr-style Project Catalog may commoditize services, reducing freelancers' perceived value.
Strategic shifts under new management may not align with current client expectations.

Differentiation

Upwork connects freelancers with clients across diverse industries, enhancing global work opportunities.
The platform offers tools like time tracking and invoicing for efficient project management.
Upwork's tiered fee structure incentivizes long-term client relationships, differentiating it from competitors.

Upsides

Upwork's acquisition of Objective AI enhances its AI capabilities for better talent matching.
The introduction of Featured Jobs increases visibility for job posts, attracting more candidates.
Upwork's recognition as the top job posting site boosts its credibility among employers.
