Monitoring

Centralized observability with Prometheus metrics, Grafana dashboards, and configurable alerting — scoped to your environment.

Table of contents
  1. Metrics Architecture
  2. Grafana Dashboards
    1. Canton Network Dashboards
    2. Infrastructure Dashboards
  3. Metrics Collection
    1. Canton Node Metrics
    2. Kubernetes Metrics
  4. Alerting
    1. Notification Channels
    2. Alert Examples
  5. Client Access
    1. What You See
    2. Access Control
  6. Retention and Storage

Metrics Architecture

Metrics flow from your Canton nodes through a centralized collection and visualization pipeline:

graph LR
    subgraph env["Environment Cluster"]
        Validator[Validator<br/>:10013] -->|scrape| Agent[Prometheus<br/>Agent]
        Participant[Participant<br/>:10013] -->|scrape| Agent
        Kubelet[Kubelet /<br/>cAdvisor] -->|scrape| Agent
    end

    Agent -->|remote write| Central[Central<br/>Prometheus]
    Central --> Grafana[Grafana<br/>Dashboards]
    Central --> Alertmanager[Alertmanager]
    Alertmanager -->|notify| Channels[Email / Slack /<br/>PagerDuty]

  • Prometheus Agent runs on each environment cluster, scraping metrics every 30 seconds
  • Metrics are shipped via remote write to the central Prometheus instance on the shared cluster
  • Grafana provides visualization dashboards backed by the central metrics store
  • Alertmanager evaluates alert rules and routes notifications to configured channels
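The pipeline above can be sketched as a minimal per-cluster agent configuration; the target addresses and remote-write URL below are placeholders, not the actual deployment values:

```yaml
# Sketch of prometheus.yml for the per-cluster agent (agent mode).
# Service names and the remote-write endpoint are illustrative.
global:
  scrape_interval: 30s            # matches the 30-second cadence noted above

scrape_configs:
  - job_name: canton-nodes
    static_configs:
      - targets:
          - validator.canton.svc:10013
          - participant.canton.svc:10013

remote_write:
  - url: https://central-prometheus.example.internal/api/v1/write
```

Running Prometheus with `--enable-feature=agent` disables local querying and keeps only a short write-ahead buffer, which is why the agent tier needs so little retention.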

Grafana Dashboards

Pre-configured dashboards provide visibility into your Canton nodes and underlying infrastructure:

Canton Network Dashboards

Dashboard | What It Shows
Health | Overall node health status, uptime, and connectivity indicators
Participant Metrics | Ledger API throughput, latency, active contracts, transaction rates
Sequencer Client | Sequencer connection status, message processing, and lag
Synchronizer Fees (Validator) | Fee collection, reward distribution, and token balances
Validator Licenses | License status and validator network participation

Infrastructure Dashboards

Dashboard | What It Shows
Kubernetes Cluster Overview | Cluster-wide resource utilization, node status, and pod counts
Kubernetes Node Resources | Per-node CPU, memory, disk, and network utilization
Kubernetes Pod Resources | Per-pod resource consumption and limits
Kubernetes PVC | Persistent volume capacity, usage, and growth trends
GCP Infrastructure | Cloud-level metrics from Google Cloud Platform

Metrics Collection

Canton Node Metrics

Both Validator and Participant nodes expose Prometheus metrics on port 10013:

  • Transaction metrics — submission rates, confirmation times, rejection counts
  • Ledger metrics — active contract count, ledger end offset, pruning status
  • Connection metrics — sequencer connectivity, domain connections
  • JVM metrics — heap usage, garbage collection, thread counts
  • gRPC metrics — API call rates, latencies, error rates
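One way to wire these endpoints into collection is a Prometheus Operator ServiceMonitor; the namespace, label selector, and port name below are illustrative assumptions, not the deployed manifest:

```yaml
# Hypothetical ServiceMonitor (Prometheus Operator CRD) for Canton nodes.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: canton-nodes
  namespace: canton
spec:
  selector:
    matchLabels:
      app.kubernetes.io/part-of: canton   # assumed label on the node Services
  endpoints:
    - port: metrics      # assumes the Service names port 10013 "metrics"
      interval: 30s
```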

Kubernetes Metrics

Standard Kubernetes metrics are collected automatically:

  • Container resources — CPU, memory, and network per container
  • Pod lifecycle — restarts, scheduling latency, readiness
  • Storage — persistent volume utilization and I/O metrics
  • Cluster health — node conditions, API server responsiveness
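As a sketch, per-pod aggregates of the cAdvisor container metrics can be precomputed with a recording rule; the metric name is the standard cAdvisor one, while the rule and group names are assumptions:

```yaml
# Illustrative recording rule: per-pod CPU usage from cAdvisor metrics.
groups:
  - name: kubernetes-resources
    rules:
      - record: namespace_pod:container_cpu_usage:rate5m
        expr: |
          sum by (namespace, pod) (
            rate(container_cpu_usage_seconds_total{container!=""}[5m])
          )
```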

Alerting

Alertmanager processes alert rules and delivers notifications through your preferred channels:

Notification Channels

  • Email — direct notifications to your operations team
  • Slack — channel-based alerts for team visibility
  • PagerDuty — incident management integration for critical alerts
  • Webhooks — custom integrations with your existing systems
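A routing tree along these lines might look as follows; all receiver details are placeholders, and `<integration-key>` stands in for your PagerDuty key:

```yaml
# Sketch of an Alertmanager routing tree: critical alerts page,
# everything else goes to the operations mailbox.
route:
  receiver: ops-email
  group_by: [alertname, namespace]
  routes:
    - matchers: ['severity="critical"']
      receiver: pagerduty

receivers:
  - name: ops-email
    email_configs:
      - to: ops@example.com
  - name: pagerduty
    pagerduty_configs:
      - routing_key: <integration-key>
```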

Alert Examples

Alert | Severity | Condition
Node unhealthy | Critical | Validator or participant pod not ready for > 5 minutes
High resource usage | Warning | CPU or memory exceeding 80% of limits
Database storage low | Warning | PostgreSQL PVC usage above 85%
Certificate expiring | Warning | TLS certificate expires within 14 days
Backup failure | Critical | Scheduled backup job did not complete
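For instance, the node-unhealthy condition could be expressed as a Prometheus rule roughly like this; the kube-state-metrics selector and namespace are assumptions, not the deployed rule:

```yaml
# Illustrative alert rule for a pod that has not been ready for 5 minutes.
groups:
  - name: canton-alerts
    rules:
      - alert: NodeUnhealthy
        expr: kube_pod_status_ready{condition="true", namespace="canton"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Validator or participant pod not ready for more than 5 minutes"
```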

Client Access

What You See

Each client has access to metrics scoped to their own namespace:

  • Grafana dashboards filtered to your environment and namespace
  • Prometheus metrics accessible via authenticated queries
  • No visibility into other clients’ data or platform internals

Access Control

  • Dashboard access is authenticated through Keycloak OIDC
  • Metrics endpoints are protected by Istio authorization policies with IP whitelisting
  • Only authorized IP addresses can reach the metrics scrape endpoints
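A sketch of such a policy, assuming Istio's AuthorizationPolicy CRD; the policy name, namespace, and CIDR range are illustrative:

```yaml
# Hypothetical Istio AuthorizationPolicy restricting the metrics port by source IP.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: metrics-ip-allowlist
  namespace: canton
spec:
  action: ALLOW
  rules:
    - from:
        - source:
            ipBlocks: ["203.0.113.0/24"]   # replace with your authorized ranges
      to:
        - operation:
            ports: ["10013"]
```

Note that `ipBlocks` matches the source IP as seen by the sidecar; behind a load balancer that rewrites source addresses, `remoteIpBlocks` may be the appropriate field instead.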

Retention and Storage

Tier | Retention | Purpose
Prometheus Agent (per-cluster) | 30 minutes | Local buffer before remote write
Central Prometheus (shared cluster) | Configurable | Long-term metrics storage and query
Grafana | N/A | Visualization layer; queries Prometheus on demand
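The retention split corresponds to the server flags each tier runs with; the structure and values below are examples, not the deployed settings:

```yaml
# Illustrative container args for the two Prometheus tiers.
agent:
  args:
    - --enable-feature=agent              # agent mode: short local WAL buffer only
central:
  args:
    - --storage.tsdb.retention.time=90d   # example long-term window; configurable
```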

© 2026 Noders. All rights reserved.