2024-08-20 Hünkar Döner

Modern Observability: AWS + OpenTelemetry + Envoy Gateway

Tags: Observability, OpenTelemetry, AWS, Envoy, Monitoring

Microservices and distributed systems work well until something goes wrong. When a request fails, finding the root cause can feel like looking for a needle in a haystack, and reading log files one service at a time is no longer enough. What we need is observability.

Observability is commonly described as resting on three pillars: logs, metrics, and traces. This article explains how to achieve end-to-end visibility on AWS infrastructure using the OpenTelemetry standard.

What is OpenTelemetry (OTEL) and Why is it Important?

In the past, each monitoring tool (Datadog, New Relic, AWS X-Ray) shipped its own agent and instrumentation library, so instrumenting your code tied you to that tool (vendor lock-in). OpenTelemetry (OTEL) is an open-source CNCF project that standardizes how telemetry data is generated, collected, and exported.

  • Benefit: You instrument your application once with OTEL and can send the data to AWS X-Ray, Prometheus, or any other compatible backend. Switching backends becomes a matter of exporter or Collector configuration rather than a change to your application code.
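To make the "instrument once" idea concrete, here is a minimal sketch that builds the standard OpenTelemetry SDK environment variables for a service. The variable names (`OTEL_SERVICE_NAME`, `OTEL_TRACES_EXPORTER`, `OTEL_EXPORTER_OTLP_PROTOCOL`, `OTEL_EXPORTER_OTLP_ENDPOINT`) come from the OpenTelemetry specification; the helper `otel_env`, the service name, and the collector hostnames are hypothetical and only serve the illustration.

```python
def otel_env(service_name: str, otlp_endpoint: str) -> dict:
    """Build OpenTelemetry SDK environment variables for one service.

    The application code stays the same; only the endpoint value decides
    where the telemetry is sent.
    """
    return {
        "OTEL_SERVICE_NAME": service_name,
        "OTEL_TRACES_EXPORTER": "otlp",
        "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
        "OTEL_EXPORTER_OTLP_ENDPOINT": otlp_endpoint,
    }


if __name__ == "__main__":
    # Hypothetical collector addresses; 4317 is the default OTLP gRPC port.
    to_adot = otel_env("checkout", "http://adot-collector:4317")
    to_other = otel_env("checkout", "http://other-collector:4317")

    # Show which settings differ between the two targets.
    for key in to_adot:
        if to_adot[key] != to_other[key]:
            print(f"{key}: {to_adot[key]} -> {to_other[key]}")
```

In this sketch, moving from one backend to another changes a single configuration value, which is the decoupling the OTEL model is aiming for.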

AWS Distro for OpenTelemetry (ADOT)

AWS supports the OTEL project and offers ADOT, a secure, production-ready distribution of OpenTelemetry that integrates with AWS compute services (EKS, ECS, Lambda). The ADOT Collector receives telemetry from your applications and forwards it to services such as Amazon CloudWatch and AWS X-Ray.
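As a rough illustration of what such a Collector pipeline might look like, the sketch below assembles a configuration as a Python dictionary and prints it as JSON (JSON is also valid YAML, the format the Collector reads). The component names `otlp`, `batch`, `awsxray`, and `awsemf` are existing Collector components; the region, the helper names, and the exact set of options are assumptions, so treat this as a starting point rather than a production configuration.

```python
import json


def adot_collector_config(region: str) -> dict:
    """Sketch of a Collector config: OTLP in, X-Ray and CloudWatch (EMF) out."""
    return {
        "receivers": {
            "otlp": {
                "protocols": {
                    "grpc": {"endpoint": "0.0.0.0:4317"},
                    "http": {"endpoint": "0.0.0.0:4318"},
                }
            }
        },
        "processors": {"batch": {}},
        "exporters": {
            "awsxray": {"region": region},  # traces -> AWS X-Ray
            "awsemf": {"region": region},   # metrics -> CloudWatch (EMF)
        },
        "service": {
            "pipelines": {
                "traces": {
                    "receivers": ["otlp"],
                    "processors": ["batch"],
                    "exporters": ["awsxray"],
                },
                "metrics": {
                    "receivers": ["otlp"],
                    "processors": ["batch"],
                    "exporters": ["awsemf"],
                },
            }
        },
    }


def undefined_components(config: dict) -> list:
    """List pipeline components that are referenced but never defined."""
    missing = []
    for name, pipeline in config["service"]["pipelines"].items():
        for section in ("receivers", "processors", "exporters"):
            for component in pipeline.get(section, []):
                if component not in config.get(section, {}):
                    missing.append(f"{name}:{section}/{component}")
    return missing


if __name__ == "__main__":
    config = adot_collector_config("eu-west-1")  # assumed region
    assert not undefined_components(config)
    print(json.dumps(config, indent=2))
```

The small `undefined_components` check is a hypothetical convenience: it catches a pipeline that references a receiver, processor, or exporter that was never declared, which is an easy mistake to make when editing these files by hand.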

Network Visibility with Envoy Gateway

Monitoring application code is not enough; you must also monitor network traffic. Envoy Gateway, the entry point to your Kubernetes cluster, has built-in support for exporting telemetry via OTEL.

  • Distributed Tracing: With tracing enabled, Envoy assigns a unique trace ID to each incoming request that does not already carry one. As long as every service propagates the trace context headers, this ID follows the request through all the microservices it touches. You can then visualize, on the AWS X-Ray map, the journey of a request entering through Envoy, going to Service A, then Service B, and finally the database, and see at which step the latency occurred with millisecond precision.
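The propagation mechanism can be sketched in a few lines, assuming the W3C Trace Context `traceparent` header (format `00-<trace-id>-<span-id>-<flags>`). This is a simplified model of what a proxy or SDK does, not Envoy's actual implementation; the function names are hypothetical.

```python
import re
import secrets

# version "00", 16-byte trace ID, 8-byte span ID, 1-byte flags (lowercase hex)
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace with random trace and span IDs."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def parse_traceparent(header: str):
    """Return (trace_id, span_id, flags), or None if the header is invalid."""
    match = TRACEPARENT_RE.match(header)
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero IDs are defined as invalid by the W3C specification.
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return trace_id, span_id, flags


def handle_hop(incoming_headers: dict) -> dict:
    """Return the headers one hop forwards to the next service."""
    parsed = parse_traceparent(incoming_headers.get("traceparent", ""))
    if parsed is None:
        # Entry point (e.g. the gateway): no valid context, start a trace.
        return {"traceparent": new_traceparent()}
    trace_id, _parent_span_id, flags = parsed
    # Keep the trace ID, create a new span ID for this hop.
    return {"traceparent": f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"}


if __name__ == "__main__":
    gateway = handle_hop({})           # request arrives without context
    service_a = handle_hop(gateway)    # gateway -> Service A
    service_b = handle_hop(service_a)  # Service A -> Service B
    for name, headers in [("gateway", gateway), ("A", service_a), ("B", service_b)]:
        print(name, headers["traceparent"])
```

Every hop prints the same trace ID with a different span ID, which is what allows a backend such as X-Ray to stitch the individual spans into one request timeline.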

How to Build the Architecture

  1. Application Level: Add OpenTelemetry SDKs to your applications (Java, Python, Go).
  2. Infrastructure Level: Install the ADOT Collector on your EKS cluster (as a DaemonSet or sidecar).
  3. Network Level: Enable tracing on Envoy Gateway and route data to the ADOT Collector.
  4. Visualization: Monitor all data in one place with AWS X-Ray and CloudWatch ServiceLens.
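For step 3, a possible shape of the Envoy Gateway configuration is sketched below as a Python dictionary rendered to JSON, which `kubectl apply -f` also accepts. It assumes the `EnvoyProxy` resource of the `gateway.envoyproxy.io/v1alpha1` API with an OpenTelemetry tracing provider that references the Collector through `backendRefs`; field names have changed between Envoy Gateway releases, so check them against the version you run. The resource name, namespaces, and Collector service name are hypothetical.

```python
import json


def envoy_proxy_tracing(collector_service: str,
                        collector_namespace: str,
                        sampling_rate: int = 100) -> dict:
    """Sketch of an EnvoyProxy resource that sends traces to a Collector.

    Assumption: field names follow the gateway.envoyproxy.io/v1alpha1 API;
    verify them against your Envoy Gateway release before applying.
    """
    if not 0 <= sampling_rate <= 100:
        raise ValueError("sampling_rate must be a percentage between 0 and 100")
    return {
        "apiVersion": "gateway.envoyproxy.io/v1alpha1",
        "kind": "EnvoyProxy",
        "metadata": {
            "name": "otel-tracing",               # hypothetical name
            "namespace": "envoy-gateway-system",  # assumed install namespace
        },
        "spec": {
            "telemetry": {
                "tracing": {
                    "samplingRate": sampling_rate,
                    "provider": {
                        "type": "OpenTelemetry",
                        "backendRefs": [
                            {
                                "name": collector_service,
                                "namespace": collector_namespace,
                                "port": 4317,  # default OTLP gRPC port
                            }
                        ],
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical Collector service name and namespace.
    manifest = envoy_proxy_tracing("adot-collector", "observability")
    print(json.dumps(manifest, indent=2))
```

The resource only takes effect once it is attached to the gateway, typically through a `parametersRef` on the GatewayClass. Sampling 100% of requests is convenient while testing, but a lower rate is usually preferable under production traffic.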

Observability is indispensable for system reliability. You can establish a modern monitoring infrastructure by utilizing our Kubernetes Consultancy and AWS Consultancy services to eliminate blind spots in your complex systems.