Site Reliability Manager

Location: Reading



Job Description

The Site Reliability Engineer is a senior technical role in the business, working as part of the Site Reliability Team to implement and optimise systems that improve operational efficiency and reliability.

The role is largely internally focused, working with the other technical departments. However, every deliverable ultimately affects customers, so occasional customer interaction is expected.


Key Responsibilities

  • Maintain stability and uptime of the monitoring platform
  • Develop toolsets to enable the technical teams to monitor their areas of responsibility
  • Develop toolsets for customers to monitor the services they consume from the organisation
  • Develop and document processes for onboarding new services onto the monitoring platform
  • Contribute to the delivery of a monitoring-as-a-service strategy

Essential Skills

  • Scripting or programming knowledge (Python, TypeScript and PowerShell)
  • Linux Systems Administration (CentOS)
  • Experience with configuration management tools (Ansible)
  • An ability to communicate complex technical topics to a less technical audience when dealing with customers or other parts of the business

Ready to Apply?

To apply, please submit a CV and cover letter to the Recruitment Manager, David Wilson, at recruitment@node4.co.uk
No agencies, please.