Observability: Monitoring, Logging & Tracing
Tracing Errors with ConSol
Modern applications are complex. Consequently, they generate a large volume of measurement and analysis data about the application’s health. This data also helps to fix the problem should an error occur. The observer, however, is confronted with a barely manageable flood of data. Observability is the ability to monitor IT applications holistically in order to get this flood of data under control. This requires that the application itself provides appropriate data. In addition, development and operations teams must be given the tools they need to act swiftly and purposefully in the event of an error.

All applications and the underlying infrastructure produce metrics, logs and, where useful, traces. These are collected and processed by proven open-source tools such as Prometheus (metrics), Loki (logs) and Jaeger (traces). The data is then visualized centrally in Grafana dashboards. There, users get an overview of the applications and infrastructure components they share. For long-term storage of data, additional databases such as InfluxDB can be used.
Observability – Chasing "Mister X"
Observability basically consists of three components: monitoring, logging and tracing. Monitoring reports when a defined service level or quality criterion is no longer met. For this purpose, application developers define suitable metrics, which are in turn exposed directly by the application. The logs contain the error messages of each individual software component. They show where in the various services the error occurs. Tracing allows us to identify the path a call has taken between services before resulting in a problem. By means of correlation IDs, we can view all this information together in a central dashboard. This way, we keep an overview even in complex applications and quickly track down the source of an error.
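
To illustrate the correlation ID idea, here is a minimal sketch in Python that uses only the standard library. It stamps every log line of a request with the same ID so that the lines can later be found together. The names `correlation_id`, `CorrelationIdFilter`, `build_logger` and `handle_request`, the logger name `checkout` and the ID `req-42` are assumptions made for this example and are not part of any of the tools mentioned here.

```python
import contextvars
import io
import logging
import uuid

# Hypothetical context variable that carries the correlation ID of the current call.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Copy the current correlation ID onto every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True


def build_logger(stream):
    """Create a logger whose lines start with the correlation ID."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(correlation_id)s %(name)s %(levelname)s %(message)s")
    )
    handler.addFilter(CorrelationIdFilter())
    logger = logging.getLogger("checkout")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger, incoming_id=None):
    """Reuse the caller's ID if one was passed along, otherwise start a new one."""
    token = correlation_id.set(incoming_id or uuid.uuid4().hex)
    try:
        logger.info("order received")
        logger.error("payment service timed out")
        return correlation_id.get()
    finally:
        # Restore the previous value so IDs do not leak between requests.
        correlation_id.reset(token)


log_output = io.StringIO()
used_id = handle_request(build_logger(log_output), incoming_id="req-42")
print(log_output.getvalue(), end="")
```

If each service passes the ID on with its outgoing calls, the same value can be used as a search key across metrics, logs and traces.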
Observability Tools
For observability, we favor open-source solutions. In our experience, they have no disadvantages compared with commercial solutions. We have been using open-source solutions for many years, both with our clients and in our own production environments. They also offer a truly remarkable range of functions.

Prometheus is the de facto standard for cloud-native monitoring and alerting. It offers simple configuration of where and how metrics are collected. Most applications support exporting metrics to Prometheus. For self-written applications, there is also strong support for exporting metrics to Prometheus in all common programming languages.
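
As a rough illustration of what exporting a metric means, the following Python sketch renders a counter in the Prometheus text exposition format using only the standard library. The `Counter` class, the metric name `http_request_errors_total` and the labels are assumptions made for this example. A real application would normally use a Prometheus client library, which also handles details omitted here such as label escaping and serving the output over HTTP.

```python
from collections import defaultdict


class Counter:
    """Toy counter that renders itself in the Prometheus text exposition format."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        # Key: sorted tuple of (label, value) pairs, value: current count.
        self._values = defaultdict(float)

    def inc(self, amount=1.0, **labels):
        if amount < 0:
            raise ValueError("counters can only increase")
        self._values[tuple(sorted(labels.items()))] += amount

    def render(self):
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for labels, value in sorted(self._values.items()):
            # Simplification: label values are not escaped in this sketch.
            label_text = ",".join(f'{key}="{val}"' for key, val in labels)
            selector = f"{{{label_text}}}" if label_text else ""
            lines.append(f"{self.name}{selector} {value}")
        return "\n".join(lines) + "\n"


# Hypothetical metric for failed requests, labelled by service and status code.
http_errors = Counter("http_request_errors_total", "Failed HTTP requests.")
http_errors.inc(service="checkout", status="500")
http_errors.inc(service="checkout", status="500")
http_errors.inc(service="cart", status="503")
exposition = http_errors.render()
print(exposition, end="")
```

Prometheus collects text of this shape from the application at regular intervals, which is why the application only has to keep the current values available.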

Loki allows for simple ingestion and indexing of logs. Its configuration is derived from Prometheus and aims at quickly finding logs that match certain criteria. To this end, only a very small index needs to be written. Because queries are heavily parallelized, they can be executed quickly even with large amounts of data.
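
The idea of a very small index can be sketched in a few lines of Python. The toy store below indexes only the label sets of log streams and never the log text itself. A query first selects streams by label and only then scans their lines. This is an illustration of the principle under our own assumptions, not Loki's actual storage implementation. The class `LabelIndexedLogStore` and the labels `app` and `env` are made up for the example.

```python
from collections import defaultdict


class LabelIndexedLogStore:
    """Toy log store: the index holds label sets only, never the log text."""

    def __init__(self):
        # Key: frozen set of (label, value) pairs, value: raw log lines.
        self._streams = defaultdict(list)

    def push(self, labels, line):
        self._streams[frozenset(labels.items())].append(line)

    def index_size(self):
        """Number of index entries, which is one per distinct label set."""
        return len(self._streams)

    def query(self, selector, contains=""):
        wanted = set(selector.items())
        hits = []
        for labels, lines in self._streams.items():
            if wanted <= labels:  # index lookup: label matching only
                # The text filter is a plain scan over the selected streams.
                hits.extend(line for line in lines if contains in line)
        return hits


store = LabelIndexedLogStore()
store.push({"app": "checkout", "env": "prod"}, "payment service timed out")
store.push({"app": "checkout", "env": "prod"}, "order 17 stored")
store.push({"app": "cart", "env": "prod"}, "cart service timed out")

timeouts = store.query({"app": "checkout"}, contains="timed out")
print(store.index_size(), timeouts)
```

Three log lines produce only two index entries here. The scan over the selected streams is the part that lends itself to parallel execution.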

Grafana is used to visualize metrics. It integrates very well with Prometheus, Loki and Jaeger and allows metrics as well as traces to be displayed in graphs. It is also possible to jump directly to individual traces and to display the logs that belong to certain metrics. Besides a large selection of predefined dashboards with various metrics, users can also create their own dashboards.

Jaeger supports the OpenTracing standard, which makes it easy to integrate applications with Jaeger. For self-written applications there is, much like with Prometheus, broad support for programming languages and frameworks. Other advantages of Jaeger, besides its widespread use, include its simple installation and its ability to scale to larger amounts of data.
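
To show how a trace reveals the path of a call, the following Python sketch models spans as plain records and walks from a failing span back to the root. It does not use the Jaeger or OpenTracing client APIs. The `Span` fields, the function `path_to_error` and the service names are assumptions chosen for the example, and the sketch assumes that the spans form a well-formed tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str              # shared by all spans of one call
    span_id: str               # unique within the trace
    parent_id: Optional[str]   # None for the root span
    service: str
    operation: str
    error: bool = False


def path_to_error(spans):
    """Return the service:operation chain from the root span to the first failing span."""
    by_id = {span.span_id: span for span in spans}
    failing = next((span for span in spans if span.error), None)
    if failing is None:
        return []
    path = []
    current = failing
    while current is not None:
        path.append(f"{current.service}:{current.operation}")
        # The root span has parent_id None, which ends the walk.
        current = by_id.get(current.parent_id)
    return list(reversed(path))


# Hypothetical trace of one request that fails in the payment service.
spans = [
    Span("t1", "a", None, "gateway", "GET /order"),
    Span("t1", "b", "a", "checkout", "place_order"),
    Span("t1", "c", "b", "payment", "charge", error=True),
    Span("t1", "d", "a", "cart", "load_cart"),
]
call_path = path_to_error(spans)
print(" -> ".join(call_path))
```

A tracing backend stores records of this kind for every call and presents the same parent-child chain graphically.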
OpenShift Service Mesh Based on Istio
Microservice management: An intermediary layer in your application ensures secure and error-free communication between services and also serves to optimize your application’s performance. This results in less code for your developers, who can now concentrate fully on the app’s business value.
