Senior Site Reliability Engineer (SRE) at Intercom

Montreal, Quebec, Canada

Compensation: Not Specified
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Technology, Software

Requirements

  • Bachelor's degree in software engineering, computer science or equivalent
  • Minimum of 7 years of experience in cloud management, development and/or SRE responsibilities
  • Experience in Agile methodology and technical project execution
  • Knowledge of DevOps concepts, AWS, Azure, GCP, observability tools (Datadog, Cloudflare), Terraform and PagerDuty, and of how to integrate them
  • Strong initiative and resilience, with a demonstrated ability to explore new ideas and innovative approaches to solving complex problems
  • Excellent interpersonal and communication skills in both French and English
  • Comfortable working in a fast-moving environment
  • On-call availability is required for the initial months to observe and refine existing processes

Responsibilities

  • Incident Management: Detect and respond to issues, ensuring rapid recovery to minimize downtime; improve coordination and structure in investigations; define and implement an escalation process; ensure communication with, and buy-in from, all stakeholders; document incident reports and conduct post-mortems
  • Collaboration: Work closely with development and operations teams to ensure smooth deployment and operation of applications; provide primary operational support and engineering for large-scale distributed software applications; collaborate with development teams to improve services through rigorous testing and release procedures; participate in system design consulting, platform management, and capacity planning
  • Influence: Create sustainable systems and services through automation and enhancements; promote a culture of innovation and continuous improvement; coordinate the SRE team in establishing and executing operational policies that promote agility and scalability; coordinate and mentor SRE team members
  • Automation: Automate repetitive tasks to improve efficiency and reduce human error; improve the reliability, quality, and time-to-market of software solutions; measure and optimize system performance in anticipation of business needs
  • Monitoring and Alerting: Implement and enhance monitoring systems (e.g., Datadog); monitor and maintain the production environment, ensuring high availability and system health; gather and process metrics from operating systems and applications; develop a health monitoring dashboard
  • Disaster Recovery: Prepare and implement disaster recovery plans to manage unexpected outages
  • Performance Optimization: Continuously improve system performance and scalability
  • Capacity Planning: Ensure the infrastructure can handle current and future demands
  • Chaos Engineering: Intentionally introduce failures to test system resilience and improve robustness

Skills

SRE
Incident Management
On-call
Post-mortems
Datadog
Monitoring
Alerting
Automation
Distributed Systems
Capacity Planning
System Design
Deployment

Intercom

Customer communication platform for businesses

About Intercom

Intercom provides a customer communication platform that enables businesses to connect with their customers through personalized messaging and automation. The platform includes tools for live chat, email marketing, and customer support, allowing companies to manage interactions in one place. Intercom operates on a subscription model, offering various pricing tiers based on the features and scale needed by clients, which range from small startups to large enterprises across different industries. What sets Intercom apart from its competitors is its integration of multiple communication tools and analytics features, which help businesses assess the effectiveness of their customer engagement strategies. The main goal of Intercom is to enhance customer experience by facilitating better communication between businesses and their customers.

Headquarters: San Francisco, California
Year Founded: 2011
Total Funding: $234.2M
Company Stage: Series D
Industries: Consumer Software, Enterprise Software
Employees: 1,001-5,000

Benefits

We reward our people - All full-time Intercom employees are offered a rewards program that includes competitive base salary, bonus, equity, and benefits.
We love to learn - There are so many opportunities here to learn and build a career. We support employees' growth by offering a wide range of continuing education options, including core skills, management training, and guided learning.
We take employee wellbeing seriously - We offer comprehensive healthcare coverage, including medical, dental, and vision, as well as employee assistance programs and mental health resources. We also offer flexible time off, significant paid family leave, and more.
The world has changed. So has the way we work - The majority of our employees say they prefer a hybrid model of work—some in-office time, some WFH—so that's our current approach. No matter what, though, we're committed to listening to our team.

Risks

Emerging startups offer similar solutions at lower prices, threatening market share.
Major tech companies' AI advancements may outpace Intercom's offerings.
Stricter data privacy regulations could increase compliance costs for Intercom.

Differentiation

Intercom integrates live chat, email marketing, and support into a single platform.
The platform supports over 600 million monthly active end users globally.
Intercom's analytics tools help businesses measure communication strategy effectiveness.

Upsides

Demand for AI-driven support solutions enhances Intercom's offerings.
Omnichannel communication trends align with Intercom's integrated messaging solutions.
E-commerce expansion drives demand for robust customer support systems like Intercom.
