About the Company
A global leader in insurance and reinsurance, with operations in more than 50 countries.
Distinguished by exceptional financial strength and an extensive global distribution network.
Employs a diverse workforce of approximately 43,000 professionals worldwide.
Operates with a "global scale, local touch" approach to serve a wide range of clients.
An inclusive employer committed to equal opportunity and fair hiring practices across all regions.
About the Role
Founding Leadership: You will be a key hire for a newly established department, responsible for building and scaling a new SRE team from the ground up.
Global Systems Support: Provide senior-level SRE expertise for mission-critical applications deployed and used across the globe.
Reliability Engineering: Lead initiatives to improve the availability, performance, and resilience of complex infrastructure.
Strategic Automation: Champion an "automation-first" mindset to eliminate manual toil and optimize system monitoring and response.
Performance Standards: Own the definition and measurement of Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for the platform.
Requirements
Cloud Infrastructure: Extensive hands-on experience managing and scaling environments within Azure and AWS.
System Mastery: Deep technical knowledge of Linux/Unix systems, including advanced networking fundamentals.
Automation & Code: Proficiency with programming languages, frameworks, and configuration tools such as Python, Ansible, PowerShell, .NET, or Java.
Container Orchestration: Expert-level experience with Docker and Kubernetes for managing distributed microservices.
Observability Stack: Mastery of monitoring tools such as AppDynamics, Dynatrace, Grafana, and the ELK stack to support data-driven reliability decisions.