🧱 Observability Stack Architecture

This document outlines the architecture of a modern observability stack using:

  • Prometheus for metrics
  • Loki for logs
  • Tempo for traces
  • Grafana as the unified dashboard

This stack provides full visibility into your application's health, behavior, and performance.


🧭 Overview

🔍 Observability answers:

  • What is happening? (Metrics)
  • Why is it happening? (Logs)
  • Where did it happen? (Traces)

🧊 Core Components

Component     Purpose                                    Visualization
Prometheus    Scrapes metrics from applications          Grafana
Loki          Collects logs and indexes them by label    Grafana
Tempo         Collects distributed traces                Grafana
Grafana       Visualizes all of the above                Web UI

🧬 Architecture Diagram

                        +-------------------------------+
                        |      Grafana Dashboards       |
                        |   (Metrics / Logs / Traces)   |
                        +---------------+---------------+
                                        |
       +--------------------------------+---------------------------------+
       |                                |                                 |
+------+------+                +--------+---------+             +---------+---------+
| Prometheus  | <---scrapes--- |   FastAPI App    | --traces--> |       Tempo       |
| (metrics)   |                |  (via /metrics)  |             | (tracing backend) |
+-------------+                +--------+---------+             +-------------------+
                                        |
                                 logs via stdout
                                        |
                             +----------+----------+
                             |      Promtail       |
                             | (or Grafana Agent)  |
                             +----------+----------+
                                        |
                                    +---+---+
                                    | Loki  |
                                    +-------+

⚙️ Data Flow Summary

  1. Metrics (Prometheus):

    • Your FastAPI app exposes metrics via /metrics (e.g., with prometheus_client)
    • Prometheus scrapes them periodically
    • Grafana queries and visualizes them
  2. Logs (Loki):

    • Application logs (e.g., loguru, structlog, uvicorn) go to stdout
    • Promtail or Grafana Agent tails logs and sends them to Loki
    • Logs are labeled (e.g., by service, pod, env) and searchable in Grafana
  3. Traces (Tempo):

    • FastAPI app is instrumented using OpenTelemetry
    • Requests generate traces (including spans for DB, HTTP, etc.)
    • Traces are exported via OTLP to Tempo
    • Tempo stores and indexes traces for viewing in Grafana
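
Steps 2 and 3 meet on the trace ID: if every log line written to stdout carries the ID of the request's trace, Grafana can link the two. The sketch below uses only the standard library `logging` module rather than loguru or structlog. The `trace_id` field name, the `build_logger` helper, and the sample ID are assumptions for illustration; in a real service the ID would come from the active OpenTelemetry span.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Hypothetical field supplied via `extra=`; it is what lets a
            # log line be matched to a trace later.
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)


def build_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to `stream` (stdout by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep the root logger from printing a second copy
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger("orders-api")
    log.info(
        "order lookup finished",
        extra={"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736"},
    )
```

Promtail or Grafana Agent would then ship these lines unchanged; the labels (service, pod, env) are attached by the shipper, not by the application.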

🧩 Component Integration

Tool          Input Source               Output / Integration
Prometheus    /metrics endpoint          Grafana dashboards, alerts
Loki          Logs from stdout/stderr    Grafana Explore
Tempo         OpenTelemetry SDK          Grafana trace viewer
Grafana       Prometheus, Loki, Tempo    Unified view
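
The /metrics input in the table is a plain-text payload that Prometheus parses on each scrape. In a real service a library such as prometheus_client would generate it; the standard-library sketch below only illustrates roughly what that payload looks like for a counter. The `RequestCounter` class, the metric name, and the label names are hypothetical.

```python
from collections import Counter


class RequestCounter:
    """Toy counter keyed by (method, path); a stand-in for a real metrics library."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        self._values: Counter = Counter()

    def inc(self, method: str, path: str) -> None:
        """Count one request for the given label combination."""
        self._values[(method, path)] += 1

    def render(self) -> str:
        """Return the counter in the Prometheus text exposition format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        # One sample line per label combination, sorted for stable output.
        for (method, path), value in sorted(self._values.items()):
            lines.append(f'{self.name}{{method="{method}",path="{path}"}} {value}')
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    requests_total = RequestCounter("http_requests_total", "Total HTTP requests.")
    requests_total.inc("GET", "/api/orders")
    print(requests_total.render(), end="")
```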

✅ Benefits of This Stack

  • Single pane of glass: All telemetry in one UI
  • Correlated insights: Link logs ↔ traces ↔ metrics
  • Open source and cloud-native
  • Minimal vendor lock-in (fully OSS or self-hostable)

🛠️ Example Use Case

Issue: Latency spike on /api/orders

With this stack you can:

  1. Use Prometheus to see when latency increased
  2. Click a spike in Grafana → view Tempo trace
  3. Trace shows DB query took 1.2s
  4. Jump to Loki logs for same trace_id
  5. See error log: “Index on orders.created_at missing”

✅ You identified what, where, and why, across tools, in seconds.
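
Step 4 comes down to a LogQL query that selects a log stream by label and filters its lines by trace ID. A minimal sketch of building that query string follows, assuming the logs carry a `service` label and contain the trace ID as a hex string somewhere in the line; the helper name and the label are hypothetical, so substitute whatever labels your log shipper attaches.

```python
import string


def logql_for_trace(service: str, trace_id: str) -> str:
    """Build a LogQL query: select a stream by label, then filter lines by trace ID."""
    # Trace IDs are expected to be hex; rejecting anything else also keeps
    # quotes out of the generated query.
    if not trace_id or any(ch not in string.hexdigits for ch in trace_id):
        raise ValueError("trace_id is expected to be a non-empty hex string")
    # `service` is an assumed label name, not something Loki defines.
    return f'{{service="{service}"}} |= "{trace_id}"'


if __name__ == "__main__":
    print(logql_for_trace("orders-api", "4bf92f3577b34da6a3ce929d0e0e4736"))
```

Pasting the resulting string into Grafana Explore with the Loki data source selected should return only the lines written while that trace was active.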


📚 Next Steps