Architecture
Overview
Our observability platform is built on Grafana Cloud, providing comprehensive monitoring, logging, and tracing capabilities.
Platform Structure
Grafana Cloud Organization Model
Within Grafana Cloud, we operate an organization that enables us to create multiple isolated stacks. Each stack is a complete, self-contained observability environment that includes:
Grafana instance for visualization and dashboarding
Mimir for metrics storage (Grafana Cloud's Prometheus-compatible metrics backend)
Loki for log aggregation and storage
Tempo for distributed tracing
Additional services as part of the Grafana Cloud stack
The complete list of services included in each stack can be found in the Grafana Cloud stack reference.
Stack Isolation
Each stack is completely siloed from the others. Each customer's data is stored only in that customer's dedicated stack, so there is no possibility of cross-contamination between customers.
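The one-stack-per-customer model can be pictured with a small in-memory sketch. This is an illustration only, not Grafana Cloud's API: the Stack and Organization classes and the customer names "acme" and "globex" are hypothetical, and the plain lists merely stand in for Mimir, Loki, and Tempo.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Stack:
    """Hypothetical model of one customer's isolated stack."""
    customer: str
    metrics: List[str] = field(default_factory=list)  # stands in for Mimir
    logs: List[str] = field(default_factory=list)     # stands in for Loki
    traces: List[str] = field(default_factory=list)   # stands in for Tempo


class Organization:
    """Hypothetical registry that holds exactly one stack per customer."""

    def __init__(self) -> None:
        self._stacks: Dict[str, Stack] = {}

    def create_stack(self, customer: str) -> Stack:
        # Refuse to reuse a stack: each customer gets a dedicated one.
        if customer in self._stacks:
            raise ValueError(f"stack already exists for {customer}")
        stack = Stack(customer)
        self._stacks[customer] = stack
        return stack

    def stack_for(self, customer: str) -> Stack:
        return self._stacks[customer]


org = Organization()
org.create_stack("acme").logs.append("acme: login ok")
org.create_stack("globex").logs.append("globex: disk full")
```

Because every Stack owns its own lists, writing to one customer's logs cannot affect another customer's stack, which is the property the real platform provides at the infrastructure level.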
Data Storage Components
Default Configuration: Managed Services
By default, we use Grafana Cloud's fully managed services for all data storage components:
Managed Mimir: Metrics storage and querying with Prometheus compatibility
Managed Loki: Log aggregation and storage
Managed Tempo: Distributed tracing storage
This managed approach removes the operational overhead of running the storage backends ourselves while providing enterprise-grade reliability and automatic scaling.
Alternative: Self-Hosted Components
For customers with specific requirements, we can host the data storage components (Mimir, Loki, and Tempo) ourselves. This is not our default configuration but may be appropriate when:
Regulatory requirements mandate data storage in specific geographic locations
Network isolation requires data to remain within customer-controlled infrastructure
Custom data retention policies differ significantly from managed service defaults
When self-hosting components, the Grafana instance typically remains in Grafana Cloud while Mimir, Loki, and Tempo run in dedicated customer-specific or multi-tenant infrastructure. Data isolation is maintained in every deployment option, but the effort required to manage the infrastructure varies.
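As a rough sketch of how the criteria above might be recorded per customer, the following flags a customer as a candidate for self-hosted storage. All names here (StorageMode, CustomerRequirements, suggested_storage_mode) are hypothetical, and the result is only a suggestion: the text says self-hosting "may be appropriate", so the final decision remains a judgment call.

```python
from dataclasses import dataclass
from enum import Enum


class StorageMode(Enum):
    MANAGED = "managed"          # default: Grafana Cloud managed services
    SELF_HOSTED = "self-hosted"  # alternative: we host Mimir, Loki, and Tempo


@dataclass(frozen=True)
class CustomerRequirements:
    # Hypothetical flags, one per criterion listed above.
    needs_data_residency: bool = False
    needs_network_isolation: bool = False
    needs_custom_retention: bool = False


def suggested_storage_mode(req: CustomerRequirements) -> StorageMode:
    """Suggest self-hosting if any criterion applies, else the managed default."""
    if (
        req.needs_data_residency
        or req.needs_network_isolation
        or req.needs_custom_retention
    ):
        return StorageMode.SELF_HOSTED
    return StorageMode.MANAGED


default_mode = suggested_storage_mode(CustomerRequirements())
regulated_mode = suggested_storage_mode(
    CustomerRequirements(needs_data_residency=True)
)
```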
Alerting Architecture
The Challenge: Central Management with Data Isolation
While stack isolation ensures customer data security, it creates an operational challenge: how do we manage alerting rules consistently across many customer stacks without duplicating configuration or risking drift?
Solution: Cross Stack Datasource
We use Grafana Cloud's Cross Stack Datasource feature to solve this challenge. This allows us to:
Query data from multiple customer stacks from a central alerting stack
Manage alert rules centrally for consistency and scale
Maintain complete data isolation: each alert evaluation queries only the specific customer's stack
Avoid extracting or re-storing customer data, preventing any possibility of cross-contamination
How It Works
The Cross Stack Datasource acts as a federation layer. When an alert rule is evaluated:
The central alerting system queries the specific customer's stack directly
Alert evaluation happens against that customer's isolated data
Results trigger notifications through the configured channels
No customer data is copied, aggregated, or stored centrally
This approach ensures that each customer's metrics and logs remain in their dedicated stack while enabling us to provide consistent, scalable alerting across all customers.
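The evaluation flow described above can be sketched as follows. This is a simplified model, not Grafana's alerting engine: AlertRule, evaluate_rule, the query name "cpu_utilization", the customer names, and the lambda data sources are all hypothetical. Each callable stands in for a cross-stack data source bound to exactly one customer's stack.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Stand-in for a cross-stack data source: takes a query, returns one value
# read from a single customer's stack.
QueryFn = Callable[[str], float]


@dataclass(frozen=True)
class AlertRule:
    """One centrally managed rule, shared by every customer."""
    name: str
    query: str
    threshold: float


def evaluate_rule(rule: AlertRule, datasources: Dict[str, QueryFn]) -> List[str]:
    """Return the customers for which the rule is firing."""
    firing = []
    for customer, query_stack in datasources.items():
        # Each evaluation queries only this customer's stack.
        value = query_stack(rule.query)
        # Only the firing decision is kept; the value itself is not stored.
        if value > rule.threshold:
            firing.append(customer)
    return firing


datasources = {
    "acme": lambda query: 0.92,
    "globex": lambda query: 0.40,
}
rule = AlertRule(name="HighCpu", query="cpu_utilization", threshold=0.85)
firing = evaluate_rule(rule, datasources)
print(firing)  # ['acme']
```

The rule is defined once, while the loop evaluates it separately per customer, so values from different stacks are never combined.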

