Associate Director, Platform Engineering at S&P Global

Hyderabad, Telangana, India

Compensation: Not Specified
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Financial Services, FinTech

Requirements

  • Proficient in application and infrastructure observability; Splunk OpenTelemetry preferred
  • A deep understanding and practical application of Site Reliability Engineering principles
  • Ability to build and maintain a system and culture that supports and implements SLOs
  • Experienced in production environments running in AWS
  • Comfortable with Infrastructure as Code; Terraform is preferred
  • Familiar with Docker & Kubernetes, specifically EKS & ECS
  • Familiar with programming languages, with a strong preference for Python (for scripting, automation, and data analysis/AI)
  • Comfortable with CI/CD pipelines such as GitHub Actions or Azure DevOps
  • Understanding of the application lifecycle
  • Familiarity working in an agile environment
  • Ability to review architecture designs for observability coverage, high availability, resilience, and disaster recovery
  • Familiarity with Chaos Engineering principles and experience designing or running controlled experiments to test system resilience
  • Demonstrable interest or experience in AIOps, including the application of AI/ML to operational data and familiarity with platforms like AWS Bedrock
  • Excellent communication skills

Responsibilities

  • Design, implement, and maintain comprehensive observability solutions to track the health and performance of our systems
  • Analyze observability data and explore AIOps methodologies to identify potential issues, predict failures, and proactively troubleshoot problems before they impact users
  • Develop and implement alerts and notifications for critical events to ensure timely intervention
  • Collaborate with development teams to design and implement solutions that enhance system resilience and reduce downtime, in part by designing and executing chaos engineering experiments (e.g., using AWS FIS)
  • Analyze performance metrics to identify and resolve latency bottlenecks in our infrastructure
  • Implement performance optimization techniques and tools to improve the overall responsiveness of our systems
  • Work with development teams to ensure that new features and code changes do not introduce performance regressions
  • Develop and maintain metrics dashboards to track key performance indicators (KPIs) for our critical systems
  • Identify performance trends and anomalies that may indicate potential issues or areas for improvement
  • Recommend and implement performance optimization strategies to enhance the overall efficiency of our systems
  • Optimize resource utilization and minimize unnecessary expenditure on IT infrastructure
  • Identify and implement cost-effective solutions that improve the efficiency of our IT operations and reduce toil
  • Design and implement automated deployment and rollback procedures to mitigate risks associated with software updates
  • Monitor the performance of new releases and promptly address any issues that arise
  • Analyze the root causes of incidents, then identify and implement preventive measures to minimize recurrence
  • Document incident responses and communicate lessons learned to enhance our incident handling processes

Skills

SRE
Observability
AIOps
Chaos Engineering
AWS FIS
Performance Optimization
Metrics Dashboards
KPIs
Automation
Alerting
Latency Analysis

S&P Global

Provides financial information and analytics services

About S&P Global

S&P Global provides financial information and analytics to a wide range of clients, including investors, corporations, and governments. Its services include credit ratings, market intelligence, and indices, which help clients understand and navigate global financial markets. Its products draw on data analytics and research to deliver insights that help clients make informed decisions and manage risk. Unlike many competitors, S&P Global operates through a diverse range of divisions, including S&P Global Ratings and S&P Dow Jones Indices, which lets it serve a variety of financial needs. The company's goal is to support clients in driving growth while committing to corporate responsibility and positive societal impact.

Headquarters: New York City, New York
Year Founded: 1917
Company Stage: IPO
Industries: Data & Analytics, Financial Services
Employees: 10,001+

Benefits

Health Insurance
Unlimited Paid Time Off
Professional Development Budget
401(k) Company Match
Family Planning Benefits
Employee Discounts

Risks

Integration challenges with new acquisitions like ProntoNLP may cause operational issues.
Increased competition from AI-driven platforms like Brooklyn Investment Group.
Dependence on the volatile credit ratings market could affect revenue stability.

Differentiation

S&P Global integrates advanced AI tools for superior financial analytics capabilities.
The company offers comprehensive ESG solutions, meeting growing sustainability demands.
S&P Global's diverse divisions provide a wide range of financial services globally.

Upsides

Acquisition of ProntoNLP boosts data analytics and sentiment scoring capabilities.
Rising demand for ESG data enhances S&P Global's market position.
Expansion into India strengthens S&P Global's research and insights offerings.
