Monitoring at giffgaff

Monitoring remains a critical part of managing any IT system. It lets service owners track a system’s health and availability, and detect and prevent failures.

At giffgaff we run a system made up of hundreds of servers that scale up and down constantly, and a rapidly growing number of microservices running alongside our legacy applications. The number of moving parts grows quickly, and pinpointing the cause when something fails becomes a challenge.

Nine months ago we decided to build a new monitoring and alerting system that would…

SRE engineer | Technology enthusiast | Learning&Sharing