Credit to:

Note: for a basic understanding of Fluentd, have a look at the following article

The Fluentd JSON parser plugin, one of many Fluentd plugins, is in charge of parsing JSON logs. Combined with dynamic mapping, it makes it very easy to ship logs in JSON format to an Elasticsearch cluster. This is very useful but poses two problems, incorrect field types and mapping explosions, because you often don’t have much control over the data you receive.
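To make the two problems concrete, here is a minimal Python sketch that surveys a handful of JSON log lines before they are shipped. It does not reproduce Elasticsearch's dynamic mapping rules; it only records which field paths appear and which JSON value types each one carries. The function names (`flatten_fields`, `survey`) and the sample log lines are hypothetical, made up for illustration, and each line is assumed to be a JSON object.

```python
import json


def flatten_fields(doc, prefix=""):
    """Yield (dotted_path, type_name) for every leaf value in a parsed JSON object."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested objects contribute one field per leaf, not one per object.
            yield from flatten_fields(value, path)
        else:
            yield path, type(value).__name__


def survey(log_lines):
    """Return {field_path: set_of_type_names} across all log lines."""
    seen = {}
    for line in log_lines:
        for path, type_name in flatten_fields(json.loads(line)):
            seen.setdefault(path, set()).add(type_name)
    return seen


# Hypothetical sample logs (assumption: one JSON object per line).
logs = [
    '{"status": 200, "user": {"id": "a1"}}',
    '{"status": "OK", "user": {"id": "b2"}, "headers": {"x-req-1": "v"}}',
    '{"status": 503, "headers": {"x-req-2": "v"}}',
]

fields = survey(logs)

# Fields seen with more than one value type are candidates for incorrect field types.
conflicts = {path: types for path, types in fields.items() if len(types) > 1}

print("distinct fields:", len(fields))
print("type conflicts:", conflicts)
```

A field that shows up with more than one type (here `status`) hints at a field type problem, while a field count that keeps growing with every new log line (here the per-request keys under `headers`) is an early sign of a mapping explosion.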

To learn more about Elasticsearch Index Management and performance, check out the following article

Incorrect field types occur when Elasticsearch assigns…

Fluentd is an open source data collector that lets you unify data collection and consumption for better use and understanding of data. Fluentd tries to structure data as JSON as much as possible, which allows it to unify all facets of processing log data: collecting, filtering, buffering, and outputting logs across multiple sources and destinations.

Fluentd architecture. Image credit:

At giffgaff, we’ve chosen Fluentd as our data collector. We run Fluentd as a daemonset in our Kubernetes cluster. This setup guarantees the logs of all pods running in any of our nodes are collected and shipped to our Elasticsearch cluster. …

Recently we spotted a blog article from Coralogix, a specialist observability software company, about the problems of observability at scale. We were thrilled to see that they used our architecture as a reference point for how to scale an observability platform. What surprised us even more was how much our architecture has changed since we first wrote about it.

So what does our current setup look like?

At giffgaff we’ve been using NGINX as an Ingress Controller for our Kubernetes cluster from the very beginning. NGINX is the most widely adopted Kubernetes ingress provider, and has proven to be a solid solution.

For the last year or so we’ve been rolling out Istio to some of our workloads. Istio is a very complex piece of software, and very powerful. And as you know, “with great power comes great responsibility”. You can get a lot of out-of-the-box benefits from the first day, including telemetry, tracing and detailed access logs. But you can also screw things up very easily if…

Logs are underrated in many enterprise environments. Logs are often completely ignored, and only noticed when disk space is low. At that point, they’re usually deleted without review.

Sometimes logs are seen as a way to troubleshoot operational problems. Logs can be a good source of forensic information for determining what happened after an incident. However, we think that proactive logging enables improving business decisions. Logs, and in particular application logs, can contain a wide range of information that is not available otherwise.

Why are logs ignored? Log analysis isn’t easy. Effective log analysis takes some work. Logs come in…

The way data is organised across nodes in an Elasticsearch cluster has a huge impact on performance and reliability. A well-optimized configuration can make all the difference.

Data in Elasticsearch is stored in indices, so both index management and index configuration play a key role in the performance of an Elasticsearch cluster.

Index overview

What exactly is an index in Elasticsearch? Despite being a very basic question, the answer is surprisingly nuanced.

An index is a collection of documents that have somewhat similar characteristics. …

Monitoring remains a critical part of managing any IT system. Monitoring allows service owners to keep track of a system’s health and availability, and detect and prevent failures.

At giffgaff we run a system made up of hundreds of servers that scale up and down all the time, and a rapidly growing number of microservices running along with our legacy applications. The number of moving parts grows very quickly, and finding out what goes wrong when things go wrong becomes a challenge.

Nine months ago we decided to build a new monitoring and alerting system that would allow us to…

Matías Costa
