Daisy Group is hiring a Remote Site Reliability Engineer (SRE)

Job Description

What does a day look like for you here?

Use the key practices of SRE to provide operational support to customers.
Work with the customer to establish SLOs, SLIs, and SLAs, along with an appropriate monitoring process to support these service levels.
Manage the release of new features/components against the pre-agreed error budget.
Work with the customer to establish an effective process for Pre-Production Reviews.
Spend approximately 50% of your time developing tools and automation to streamline deployment, monitoring, and maintenance processes.
Support the engineering team in developing automated operational tests to demonstrate a reliability baseline.
Interface directly with the Change Squad to address poorly performing services.
Collaborate with cross-functional teams to identify and address performance bottlenecks and reliability issues.
Conduct regular performance analysis and capacity planning to ensure optimal system performance and resource utilisation.
Implement and maintain monitoring, alerting, and logging solutions to proactively identify and address issues.
Serve as a technical point of contact for clients, providing guidance on their infrastructure, technology selection, and best practices.
Participate in client meetings and project discussions to understand business objectives and requirements, and align technical solutions accordingly.
Provide ongoing support and troubleshooting assistance to address clients' technical issues and concerns (including out-of-hours support where required).

So, what are we looking for?

Proven experience as a customer-facing Site Reliability Engineer (SRE).
Experience working with IaC tools such as Terraform, alongside Git and CI/CD pipelines.
Working knowledge of a configuration manager such as Azure DevOps.
Experience in implementing and managing monitoring and logging solutions.
Experience in implementing and automating solutions on Public Cloud platforms (Azure, GCP, AWS).
Exposure to containerisation technologies such as Docker and container orchestration platforms like Kubernetes.
Understanding of security, networking, cloud computing, and distributed systems concepts.
