Wikimedia Foundation

Site Reliability Engineering Manager

Remote

Compensation: Not Specified
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Biotechnology, Non-profit

Engineering Manager, Site Reliability Engineering

Position Overview

The Wikimedia Foundation is seeking an Engineering Manager to join our Site Reliability Engineering (SRE) team. Reporting to the Director of Site Reliability Engineering, you will be responsible for supporting the engineers who develop our infrastructure and maintain the services used by hundreds of millions of people worldwide.

Responsibilities

  • Managing one to two globally distributed teams within Wikimedia’s Site Reliability Engineering organization.
  • Providing guidance, mentorship, and support to ensure team effectiveness and growth.
  • Collaborating with team members to set individual performance goals and support their career development.
  • Recruiting, hiring, and onboarding new team members.
  • Triaging incoming workload, maintaining focus on priorities, and setting realistic expectations for peers and team members.
  • Coordinating and communicating with other members of the Wikimedia product & engineering teams on relevant projects.
  • Executing complex projects and contributing to organizational strategy.
  • Continuously developing the team's roadmap in alignment with other SRE and Product & Technology teams.
  • Drafting and executing the team’s annual and quarterly plans.
  • Project managing new and existing initiatives.
  • Leading the definition, refinement, and execution of team processes.
  • Leading incident response, diagnosis, and follow-up on system alerts and outages across Wikimedia’s production infrastructure.
  • Participating in a 24/7 on-call rotation for escalations and issue resolution.
  • Facilitating the definition and establishment of Service Level Indicators (SLIs) and Objectives (SLOs) with service owners and stakeholders.

Requirements

  • Prior experience managing teams.
  • Prior hands-on experience with software or reliability engineering (within the last 3 years preferred).
  • Ability to analyze complex systems, troubleshoot issues, and devise effective solutions under pressure.
  • Proficiency in project management methodologies.
  • Strong understanding of:
    • Cloud computing
    • Networking
    • Linux systems administration
    • Containerization (e.g., Docker, Kubernetes)
    • Infrastructure as code (e.g., Terraform, Ansible)
  • Aptitude for automation and streamlining tasks.
  • Effective communication skills in spoken and written English.
  • Ability to work independently and as part of a globally distributed team.
  • Ability to travel several times a year for occasional in-person meetings.
  • B.S. or M.S. in Computer Science or equivalent related work experience.

Qualities We Value

  • Commitment to the organization's mission, values, and guiding principles.
  • Ability to disagree respectfully and work towards solutions.
  • Strong asynchronous communication skills.
  • Solutions-focused approach, embracing complexity and limited resources.
  • Self-motivated with the ability to navigate ambiguity and complete projects with limited direction.
  • Curiosity and a commitment to continuous learning.

Additionally, We'd Love It If You Have:

  • Experience working in a distributed, largely remote environment.
  • Experience contributing to open-source projects.

About the Wikimedia Foundation

The Wikimedia Foundation is the nonprofit organization that operates Wikipedia and the other Wikimedia free knowledge projects. Our vision is a world in which every single human can freely share in the sum of all knowledge. We believe that everyone has the potential to contribute something to our shared knowledge, and that everyone should be able to access that knowledge freely.

Skills

Team Management
Mentorship
Performance Management
Recruiting
Hiring
Onboarding
Workload Triage
Project Management
Incident Response
On-call Rotation
Service Level Indicators (SLI)
Service Level Objectives (SLO)
Software Engineering
Reliability Engineering

About Wikimedia Foundation

The Wikimedia Foundation operates Wikipedia and other free knowledge projects, aiming to create a world where everyone can freely access and share knowledge. It provides a platform for users to read, contribute, and share content, while also supporting the volunteer communities that help maintain these projects. The foundation is funded by donations from individuals and institutions, emphasizing its nonprofit status. Unlike many other organizations, it focuses on making knowledge accessible to all without charge, advocating for policies that support free knowledge initiatives. Its goal is to empower individuals to contribute to and benefit from a collective pool of knowledge.

Headquarters: San Francisco, California
Year Founded: 2003
Total Funding: $145.9M
Company Stage: Grant
Industries: Social Impact, Education
Employees: 501-1,000

Benefits

Remote Work Options

Risks

Reliance on Nvidia's AI tech may affect Wikimedia's data processing autonomy.
DSA audit could reveal vulnerabilities requiring significant resources to address.
Decentralized platforms like Mastodon may divert users from Wikipedia.

Differentiation

Wikimedia Foundation operates the world's largest free online encyclopedia, Wikipedia.
It supports a diverse range of projects like Wiktionary and Wikisource.
The Foundation is a non-profit, relying on global donations for funding.

Upsides

Nvidia's NeMo Retriever tech reduced Wikipedia processing time from 30 days to 3 days.
Holistic AI's audit under the DSA enhances Wikimedia's platform safety and accountability.
Collaboration with Open Foundation West Africa combats misinformation during Ghana's elections.
