Report #99728
[architecture] Choosing between the Grafana open-source observability stack and Datadog
Build on Grafana + Prometheus/Loki/Tempo/OpenTelemetry when you prioritize open standards, vendor portability, and cost control at scale; buy Datadog when you want a fully managed, opinionated platform with the richest out-of-the-box integrations and are willing to accept per-host and per-custom-metric billing.
Journey Context:
Datadog offers fast time-to-value and deep integrations, but its pricing is hard to predict, and per-host and custom-metric charges can grow sharply as usage scales. The Grafana stack (Prometheus for metrics, Loki for logs, Tempo for traces, Grafana for visualization, Alloy/OpenTelemetry for collection) is open source, supports both pull and push collection, and avoids proprietary agents. Grafana Cloud adds managed hosting and adaptive telemetry to reduce costs. Teams that have migrated from Datadog to Grafana Cloud report savings of 40–50% or more and restored retention, but the migration requires rebuilding dashboards and alerts. The trap is assuming 'open source = free': operating the stack at scale takes SRE expertise, and for small teams without it, Datadog's managed offering can be cheaper once engineering time is counted.
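The 'open source = free' trap can be made concrete with a rough monthly cost model. The sketch below is illustrative only: every rate, the included-metrics allowance, and the SRE staffing figures are hypothetical placeholders, not real Datadog or Grafana prices, and the function names are invented for this example. Substitute figures from your own invoices and payroll before drawing conclusions.

```python
from dataclasses import dataclass

# All constants below are HYPOTHETICAL placeholders, not vendor list prices.
PER_HOST = 20.0                   # managed platform: monthly charge per host
PER_CUSTOM_METRIC = 0.05          # managed platform: monthly charge per billable custom metric
INCLUDED_METRICS_PER_HOST = 100   # custom metrics assumed to be bundled with each host
INFRA_PER_HOST = 3.0              # self-hosted: storage/compute cost attributed per monitored host
SRE_MONTHLY_COST = 15_000.0       # fully loaded monthly cost of one SRE


@dataclass
class Workload:
    hosts: int
    custom_metrics: int


def managed_monthly_cost(w: Workload) -> float:
    """Per-host plus per-custom-metric billing."""
    billable = max(0, w.custom_metrics - w.hosts * INCLUDED_METRICS_PER_HOST)
    return w.hosts * PER_HOST + billable * PER_CUSTOM_METRIC


def self_hosted_monthly_cost(w: Workload, sre_fte: float) -> float:
    """Infrastructure plus the engineering time needed to operate the stack."""
    return w.hosts * INFRA_PER_HOST + sre_fte * SRE_MONTHLY_COST


def cheaper_option(w: Workload, sre_fte: float) -> str:
    """Return which option costs less per month under the assumptions above."""
    if managed_monthly_cost(w) <= self_hosted_monthly_cost(w, sre_fte):
        return "managed"
    return "self-hosted"


# Two hypothetical teams: a small one and a large one.
small = Workload(hosts=20, custom_metrics=2_000)
large = Workload(hosts=2_000, custom_metrics=500_000)

for name, workload, fte in [("small", small, 0.5), ("large", large, 2.0)]:
    print(name, cheaper_option(workload, fte))
```

With these placeholder inputs, the small team's bill is dominated by the fixed SRE cost of running the stack, while the large team's bill is dominated by per-host and custom-metric charges. The break-even point moves substantially with the assumed staffing and rates, so treat the output as a way to structure the comparison rather than as a result.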
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T04:57:52.556791+00:00— report_created — created