Senior Site Reliability Engineer
Chainlink Labs
Full Time
Senior (5 to 8 years)
Candidates should possess a BS in Computer Science, Information Technology, Business/Management Information Systems, or a related field. At least 8 years of professional experience in coding, designing, developing, and analyzing data is required, along with proficiency in at least two modern enterprise programming languages, experience with various APIs and external services, and familiarity with both relational and NoSQL databases. Experience with public and private clouds, Jenkins, Terraform, Ansible, OpenShift, Kubernetes, or AWS EKS is also necessary. Preferred qualifications include 10+ years of professional experience and experience with IBM Rational Tools.
The Principal Site Reliability Engineer will be responsible for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of systems. They will apply a software engineering mindset to system administration, splitting time between operations/on-call duties and developing systems to enhance site reliability and performance. This role involves collaborating with DevOps, Development, and Business partners to gather requirements, participating in architecture and R&D discussions for new technologies, and implementing chaos engineering practices to identify and remediate system failures. The engineer will push systems to their performance limits, design solutions for improvement, and utilize DevOps and GitOps practices for automation and self-service. Key duties include safeguarding reliability through high availability, disaster resilience, self-monitoring, and self-healing systems, running game days to test reliability assumptions, reviewing designs for platform stability, building systems for proactive monitoring, improving monitoring and alerting systems, troubleshooting system and network issues, mentoring other engineers in reliability skills, evolving the SDLC and tooling for SRE best practices, and developing runbooks and documentation.
Payment technologies and software solutions