General Motors

Staff Software Engineer - Site Reliability and Observability

Austin, Texas, United States

Compensation: Not Specified
Experience Level: Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Automotive

Employment Type

Full time

Work Arrangement

Hybrid: This role is categorized as hybrid. This means the successful candidate is expected to report to either Austin, TX or Atlanta, GA at their respective innovation centers three times per week.

The Role

The Software Engineering Site Reliability Engineer (SRE) is responsible for ensuring the reliability, scalability, and performance of software systems.

Job Profile

  • System Monitoring and Troubleshooting: Monitoring the performance and availability of software systems, identifying and resolving issues, and implementing proactive measures to prevent future incidents.
  • Automation and Infrastructure: Developing and maintaining automation tools and infrastructure to streamline software deployment, configuration management, and system monitoring.
  • Performance Optimization: Analyzing system performance, identifying bottlenecks, and implementing optimizations to improve the efficiency and scalability of software systems.
  • Incident Response and Root Cause Analysis: Responding to incidents, conducting root cause analysis, and implementing corrective actions to prevent similar incidents in the future.
  • Collaboration with Development Teams: Collaborating with software development teams to ensure that reliability and scalability considerations are incorporated into the software design and implementation.
  • Continuous Improvement: Identifying opportunities for process improvement, implementing best practices, and driving initiatives to enhance the reliability and performance of software systems.

What You'll Do

  • Implement a scalable, reliable, and secure SRE and observability platform to monitor the health of our production systems and provide a holistic view of the environment.
  • Deliver tools/software to improve the reliability, scalability and operability of services.
  • Collaborate with engineering teams to analyze and provide input on architecture, infrastructure resources, and observability to achieve reliability and scalability goals.
  • Collaborate with engineering teams on production readiness reviews, deployment, operation, and refinement.
  • Partner with stakeholders to ensure data and observability tools are effectively integrated with other systems and processes.
  • Partner with stakeholders to identify, measure and monitor availability, latency and overall service health.
  • Participate in on-call engineering duty to support production.
  • Instill Site Reliability best practices through automation, data insights, and observability.
  • Perform initial incident root cause analysis with engineers and carry out incident postmortems.
  • Build runbooks and tooling to carry out production support activities.
  • Actively participate in technical discussions and deep dives with the Architectural group.

Your Skills & Abilities (Required Qualifications)

  • 7+ years of hands-on SRE experience (software development, systems monitoring) with at least one of the public cloud providers – Azure (strongly preferred), AWS, GCP.
  • Experience operating high-availability, fault-tolerant, scalable, distributed software in production: building monitoring, defining alerts, writing runbooks, establishing dashboards, etc.
  • Experience with monitoring and log aggregation frameworks, such as Azure Monitor/Sentinel, Datadog (preferred), Dynatrace, Elasticsearch, Kibana, Logstash.
  • Strong working knowledge of Docker, Kubernetes, Terraform, Chef or Ansible.
  • Experience troubleshooting JVM based applications.
  • Chaos engineering implementation experience is a big plus.
  • Extensive knowledge of the infrastructure-as-code tool Terraform.
  • Extensive knowledge of trace monitoring and of installing and configuring OpenTelemetry.
  • Strong experience in scripting/programming – Python, Java, Go, PowerShell, Bash.
  • Experience with configuration and management of SSO and Big Data/NoSQL in cloud infrastructure.
  • Knowledge of CI/CD automation frameworks: Jenkins, Azure DevOps.
  • Strong understanding of public cloud networking components.

Skills

Site Reliability Engineering
Observability
System Monitoring
Troubleshooting
Automation
Infrastructure
Performance Optimization
Incident Response
Root Cause Analysis
Scalability
Reliability
Software Deployment
Configuration Management

General Motors

Designs, manufactures, and sells vehicles

About General Motors

General Motors designs, manufactures, and sells vehicles and vehicle parts, catering to individual consumers, businesses, and government entities. The company operates in both traditional internal combustion engine vehicles and the growing electric vehicle (EV) market, generating revenue through vehicle sales and financing services. GM stands out from competitors with its commitment to community service, sustainability, and diversity, as evidenced by a majority female Board of Directors. The company's goal is to balance traditional automotive manufacturing with technological advancements in electric and autonomous vehicles.

Headquarters: Detroit, Michigan
Year Founded: 1908
Total Funding: $486.7M
Company Stage: IPO
Industries: Automotive & Transportation, Financial Services
Employees: 10,001+

Benefits

Paid Vacation
Paid Sick Leave
Paid Holidays
Parental Leave
Health Insurance
Dental Insurance
Vision Insurance
Life Insurance
401(k) Company Match
401(k) Retirement Plan
Tuition Reimbursement
Student Loan Assistance
Flexible Work Hours
Discount on GM vehicles

Risks

Shutting down Cruise Robotaxi may affect investor confidence in GM's AV strategy.
Chevrolet Equinox EV recall could harm GM's safety reputation.
Leadership transition in design may disrupt continuity and brand identity.

Differentiation

GM's Dynamic Fuel Management system enhances fuel efficiency in traditional vehicles.
GM leads in board diversity with 55% women directors.
GM's pivot to personal autonomous vehicles aligns with consumer trends.

Upsides

Partnership with Nvidia boosts GM's autonomous vehicle technology capabilities.
Collaboration with ChargePoint expands EV charging infrastructure, enhancing consumer appeal.
Bryan Nesbitt's appointment as design head may bring innovation to GM's vehicle design.
