CrowdStrike

IT Monitoring Engineer/Site Reliability Engineer (Shift: 12PM-9PM IST) (Remote)

Maharashtra, India

Compensation: Not Specified
Experience Level: Junior (1 to 2 years)
Job Type: Full Time
Visa: Unknown
Industries: Cybersecurity

Monitoring Engineer / Site Reliability Engineer (SRE)

Employment Type: Full time

Position Overview

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed — we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

The CrowdStrike Information Technology team is looking for a skilled Monitoring Engineer/Site Reliability Engineer (SRE) to join our IT Operations team. In this role, you will be responsible for designing, implementing, and maintaining monitoring solutions that ensure the reliability, availability, and performance of our critical IT infrastructure and applications. You will work at the intersection of operations and development, applying software engineering principles to operations tasks while focusing on system reliability and automation. This position requires a proactive approach to identifying and resolving issues before they impact business operations, as well as participating in on-call rotations to address incidents when they occur.

What You’ll Need

  • 5+ years of experience with enterprise monitoring tools (Prometheus, LogicMonitor, Datadog, ThousandEyes, Zscaler Digital Experience (ZDX))
  • Strong proficiency in scripting languages (Python, Bash, PowerShell) for automation
  • Experience with log management platforms (ELK stack, Splunk, LogScale)
  • Working knowledge of cloud services monitoring (AWS CloudWatch, GCP)
  • Experience with application performance monitoring (APM), digital experience monitoring (DEM) and infrastructure monitoring
  • Knowledge of SRE principles, SLOs, error budgets, and incident management
  • Experience with automated alerting, remediation workflows, and CI/CD pipeline monitoring
  • Familiarity with Infrastructure as Code (Terraform, Ansible) and containerization (Docker, Kubernetes)
  • Strong incident triage, root cause analysis, and documentation skills
  • Experience participating in on-call rotations and emergency response

What You'll Do

Monitoring and Reliability

  • Design and maintain comprehensive monitoring solutions across infrastructure and applications
  • Configure appropriate alerting thresholds to ensure timely response to potential issues
  • Define and track SLOs and error budgets for critical services
  • Create and maintain dashboards providing real-time visibility into system health
  • Conduct regular reviews of system reliability and recommend improvements

Incident Management and Operations

  • Participate in on-call rotation to respond to alerts and incidents
  • Lead incident response efforts and conduct thorough post-incident reviews
  • Document incidents, resolutions, and lessons learned
  • Develop and refine incident response procedures to improve MTTR
  • Implement proactive monitoring to detect potential issues before they impact users

Automation and Collaboration

  • Develop scripts and automation to streamline monitoring tasks and reduce manual effort
  • Create self-healing systems that can automatically remediate common issues
  • Integrate monitoring tools with other operational systems
  • Work closely with development, infrastructure, and security teams
  • Provide guidance on monitoring best practices and observability
  • Maintain comprehensive documentation for monitoring systems and procedures

Continuous Improvement

  • Stay current with industry trends in monitoring and site reliability engineering
  • Analyze monitoring data to identify patterns and impr

Skills

Prometheus
LogicMonitor
Datadog
ThousandEyes
Zscaler Digital Experience (ZDX)
Python
Bash
PowerShell
ELK stack
Splunk
LogScale

CrowdStrike

Cloud-native endpoint security solutions provider

About CrowdStrike

CrowdStrike specializes in cybersecurity, focusing on protecting businesses from cyber threats through cloud-native endpoint security solutions. Their main product, the Falcon platform, includes services like Falcon Pro, which replaces traditional antivirus with next-generation antivirus that integrates threat intelligence, Falcon Insight for endpoint detection and response, and Falcon Device Control to manage connected devices. Unlike many competitors, CrowdStrike's services are subscription-based, allowing clients to choose different levels of protection based on their needs. The company serves a diverse clientele, including many Fortune 100 companies, and is recognized as a leader in the cybersecurity field, known for its effectiveness in threat detection and response.

Headquarters: Austin, Texas
Year Founded: 2011
Total Funding: $468M
Company Stage: IPO
Industries: Enterprise Software, Cybersecurity
Employees: 5,001-10,000

Benefits

Competitive Employee Stock Purchase Plan
Remote-friendly culture
Market leader in compensation and equity awards
Competitive vacation and flexible working arrangements
Comprehensive health benefits + 401k plan
Paid Parental Leave, including adoption
Wellness programs
Professional development and mentorship opportunities
Open offices have stocked kitchens, coffee, soda and treats

Risks

Increased competition from companies like Lumos could challenge CrowdStrike's market share.
Recovery from last year's outage may still affect customer trust and future sales.
Pressure to demonstrate ROI by 2025 could challenge CrowdStrike's financial transparency.

Differentiation

CrowdStrike's Falcon platform offers cloud-native endpoint security solutions, a key differentiator.
The company serves 44 of the Fortune 100, showcasing its strong market presence.
CrowdStrike's proactive threat hunting sets it apart in cybersecurity threat detection.

Upsides

Partnership with SonicWall opens new SMB market segment for CrowdStrike.
Recognition as a leader in ransomware prevention boosts CrowdStrike's market credibility.
Gamified learning initiatives help address cybersecurity skills gap, benefiting future talent pipeline.
