Elasticsearch Observability: logs, metrics, and traces explained

Modern architectures generate ever-growing volumes of data. Microservices, APIs, cloud workloads, and serverless environments multiply potential failure points. In this context, understanding what is really happening in production has become a central challenge.

This is precisely the role of observability. It is also why Elasticsearch has gradually established itself as an analytical foundation for logs, metrics, and traces.

In this article, we will look at how Elasticsearch fits into an observability approach beyond simple logging, and how it enables technical signals to be correlated in order to better understand application behaviour.

What is observability, and why Elasticsearch is involved

Observability refers to the ability to understand the internal state of a system based on its external signals. Unlike traditional monitoring, it is not limited to predefined metrics or fixed thresholds.

Observability relies on collecting rich, contextual data, analysing it across multiple dimensions, and exploring situations that were not anticipated in advance. In this context, Elasticsearch plays a key role. Its indexing and search engine can analyse large volumes of heterogeneous data, structured or unstructured, in near real time, which aligns precisely with the needs of a modern observability approach.

The three pillars of observability: logs, metrics, and traces

An observability strategy is built on three complementary types of signals. Each addresses a different question and provides a specific perspective on system behaviour.

Logs: understanding what happened

Logs are events produced by applications and infrastructure components. In Elasticsearch, they are associated with a timestamp, either derived from the log event itself or from the ingestion time. They provide a high level of detail and make it possible to understand the precise context of an error, unexpected behaviour, or incident.

Elasticsearch has historically been well suited to this use case:

  • ingesting large volumes of data,
  • fast full-text search,
  • fine-grained event exploration.

Logs provide valuable context, but they become difficult to exploit on their own as architectures become more distributed and data volumes grow significantly.
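As an illustrative sketch of the points above, a log event shaped along ECS lines and a full-text query against it might look as follows. The index layout, field values, and query body here are assumptions for the example, not a prescribed schema:

```python
# A log event shaped along ECS (Elastic Common Schema) lines.
# Field values and the trace.id are illustrative, not prescriptive.
log_event = {
    "@timestamp": "2024-05-14T09:21:07.345Z",  # from the event or ingestion time
    "log.level": "error",
    "message": "payment service timed out after 3000 ms",
    "service.name": "checkout",
    "trace.id": "abc123",
}

# A query-DSL body for a fast full-text search over the "message" field,
# restricted to recent errors, as it could be sent to the _search endpoint.
query_body = {
    "query": {
        "bool": {
            "must": [{"match": {"message": "timed out"}}],
            "filter": [
                {"term": {"log.level": "error"}},
                {"range": {"@timestamp": {"gte": "now-15m"}}},
            ],
        }
    }
}
```

The `trace.id` field is what later allows this log line to be tied back to a distributed trace rather than read in isolation.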

Metrics: measuring system state

Metrics are numerical data aggregated over time. They describe the overall state of a system and make it possible to track its evolution. Latency, error rates, and resource consumption provide a high-level view of application or infrastructure health.

In Elasticsearch, this data is stored as time series. This enables aggregations, long-term trend analysis, and anomaly detection, while still allowing metrics to be linked to other technical signals.
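To make the time-series aggregation idea concrete, here is a hedged sketch of a query body computing average latency per minute. The field name `latency.ms` and the bucket interval are assumptions for the example:

```python
# A sketch of a metrics aggregation: average latency bucketed per minute.
# The field name "latency.ms" is a hypothetical metric field.
agg_body = {
    "size": 0,  # return only aggregated buckets, not raw documents
    "aggs": {
        "latency_over_time": {
            "date_histogram": {"field": "@timestamp", "fixed_interval": "1m"},
            "aggs": {"avg_latency": {"avg": {"field": "latency.ms"}}},
        }
    },
}
```

Widening `fixed_interval` is the usual lever for long-term trend views over the same data.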

Traces: following a request end to end

Traces describe the full journey of a request through a distributed system. They are essential for understanding dependencies between services and for pinpointing the exact source of latency or errors.

Each trace is composed of multiple spans, segments that each represent a step in the execution. Once indexed in Elasticsearch, these traces can be correlated with associated logs and metrics, making it easier to analyse complex behaviour in microservices environments.
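As a minimal sketch (span names and durations are invented for the example), a trace is a set of spans sharing one `trace.id`, and pinpointing the latency source amounts to finding the dominant span:

```python
# Illustrative trace: a root span and its children, sharing one trace.id.
spans = [
    {"trace.id": "abc123", "span.id": "s1", "name": "GET /checkout", "duration_ms": 420},
    {"trace.id": "abc123", "span.id": "s2", "name": "auth-service.verify", "duration_ms": 35},
    {"trace.id": "abc123", "span.id": "s3", "name": "payment-service.charge", "duration_ms": 310},
]

# Pinpoint the main latency contributor among the child spans.
slowest = max(spans[1:], key=lambda s: s["duration_ms"])
print(slowest["name"])  # → payment-service.charge
```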

How Elasticsearch correlates logs, metrics, and traces

The value of observability does not lie in individual signals taken in isolation, but in their correlation. Elasticsearch facilitates this correlation through several structural mechanisms:

  • a shared indexing engine,
  • common schemas such as ECS (Elastic Common Schema), which provides a shared structure for logs, metrics, and traces,
  • cross-signal search capabilities.

In practice, this approach makes it possible to navigate naturally between signals. An alert triggered by a metric can lead to the analysis of related traces, followed by the exploration of logs associated with a specific request. Kibana plays a central role by making these correlations visible and actionable, through visualisations, dashboards, and exploration tools designed for cross-signal analysis.
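The pivot described above can be sketched in plain Python: documents from different signal types that carry the same `trace.id` are grouped together. The documents below are hypothetical, and real cross-signal navigation happens through queries and Kibana rather than client-side grouping; this only illustrates the shared-field principle:

```python
from collections import defaultdict

# Hypothetical documents from three signal types; logs and traces carry trace.id.
docs = [
    {"signal": "metric", "trace.id": None, "name": "error_rate", "value": 0.07},
    {"signal": "trace", "trace.id": "abc123", "name": "POST /pay", "duration_ms": 900},
    {"signal": "log", "trace.id": "abc123", "message": "card gateway returned 503"},
]

# Group everything sharing a trace.id, mimicking a cross-signal pivot.
by_trace = defaultdict(list)
for doc in docs:
    if doc["trace.id"] is not None:
        by_trace[doc["trace.id"]].append(doc["signal"])

print(by_trace["abc123"])  # → ['trace', 'log']
```

Because ECS gives all three signal types the same field names, this kind of join needs no per-source translation layer.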

Historically, Elasticsearch is best known for powering application search engines, particularly for indexing and querying website content. The same principles of fast, contextual search apply to observability data: logs, metrics, and traces are also indexed and queried as datasets, which makes large-scale exploration and correlation possible.

OpenTelemetry: a key standard for observability with Elasticsearch

In modern architectures, data collection is just as important as data analysis. OpenTelemetry has emerged as an open standard for application instrumentation, covering traces, metrics, and logs.

Elasticsearch natively supports OpenTelemetry, enabling signal collection to be standardised without relying on proprietary formats. This compatibility improves interoperability, reduces technological lock-in, and allows observability tooling to evolve without requiring changes to existing application instrumentation.
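As an illustration of why a common schema matters here, instrumentation attributes can be projected onto ECS field names. The mapping below is a hand-picked, hypothetical subset written for this example, not the official OpenTelemetry-to-ECS mapping:

```python
# Hypothetical subset of a mapping from OpenTelemetry attribute names to ECS fields.
OTEL_TO_ECS = {
    "service.name": "service.name",        # identical in both conventions
    "service.version": "service.version",
    "telemetry.sdk.language": "agent.name",  # illustrative pairing, not normative
}

def to_ecs(attributes: dict) -> dict:
    """Keep only attributes we know how to map, renamed to their ECS fields."""
    return {OTEL_TO_ECS[k]: v for k, v in attributes.items() if k in OTEL_TO_ECS}

print(to_ecs({"service.name": "checkout", "unmapped.key": 1}))
# → {'service.name': 'checkout'}
```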

Observing your applications with Elastic on Clever Cloud

In a PaaS hosting context, observability must remain easy to enable and simple to operate. On Clever Cloud, Elasticsearch is available as a managed add-on. Applications can send their logs using Elasticsearch drains, enabling automatic centralisation of application logs. Several components can then be enabled depending on requirements:

  • a managed Elasticsearch cluster,
  • Kibana for exploration and visualisation,
  • Elastic APM for application performance analysis.

This approach makes it possible to centralise application logs, collect relevant metrics, and trace requests without having to manage the underlying infrastructure. The goal is not to multiply tools, but to provide a coherent observability foundation integrated into the application lifecycle.

Conclusion

Observability is not about stacking monitoring tools. It is about correlating logs, metrics, and traces in order to understand increasingly complex systems.

Thanks to its indexing, search, and analysis capabilities, Elasticsearch provides a solid technical foundation for this approach. Combined with open standards and interfaces such as Kibana, it enables teams to move from fragmented visibility to a comprehensive understanding of application behaviour.

In modern cloud environments, this correlation is no longer a luxury. It is a necessary condition for operating production systems reliably.
