DOCA Platform Framework (DPF) Documentation

User-Managed Components

Introduction

The observability infrastructure components are deployed and managed by the user via Helm. These components run on the Host Cluster and provide metrics collection, log aggregation, and visualization for the entire DPF platform.

For installation instructions, see the Helm Prerequisites guide.

Components Overview

The monitoring stack consists of four main components:

  • Kube-Prometheus-Stack: Unified stack with Prometheus (metrics storage), Grafana (visualization), and Prometheus Operator

  • Loki: Log aggregation and storage backend

  • OpenTelemetry Collector: Log collection from DPU clusters via OTLP

  • Kube-State-Metrics: Exposes Host Cluster Kubernetes resource metrics

Integrating with an Existing Monitoring Stack

Users who already run Prometheus and Grafana (not installed via the DPF Helm charts) can connect that stack to DPF by configuring the following integration points:

Metrics Collection (Prometheus)

DPF exposes two types of metrics that Prometheus needs to scrape:

1. Kube-State-Metrics (KSM)

KSM exposes DPF Custom Resource metrics via a standard HTTP endpoint with a ServiceMonitor. If the existing Prometheus is managed by Prometheus Operator, ensure that:

  1. The Prometheus Operator CRDs (monitoring.coreos.com) are installed in the cluster

  2. The Prometheus instance is configured to discover ServiceMonitors in the dpf-operator-system namespace (via serviceMonitorNamespaceSelector and serviceMonitorSelector)
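As a minimal sketch of the second requirement, the fragment below shows both selectors on a Prometheus custom resource. The resource name and namespace are hypothetical, and the sketch assumes namespaces carry the standard kubernetes.io/metadata.name label that current Kubernetes releases set automatically. Merge the selectors into the existing Prometheus spec rather than applying the fragment as-is.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: existing-prometheus   # hypothetical name
  namespace: monitoring       # hypothetical namespace
spec:
  # Select the namespace that holds the DPF ServiceMonitors.
  serviceMonitorNamespaceSelector:
    matchLabels:
      kubernetes.io/metadata.name: dpf-operator-system
  # An empty selector matches every ServiceMonitor in the selected namespaces.
  serviceMonitorSelector: {}
```

If the instance already selects other namespaces, extend its existing selector (for example with matchExpressions) instead of replacing it, otherwise those namespaces are no longer discovered.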

If Prometheus Operator is not used, configure a scrape job manually:

YAML
scrape_configs:
  - job_name: 'dpf-kube-state-metrics'
    kubernetes_sd_configs:
      - role: service
        namespaces:
          names: ['dpf-operator-system']
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
        regex: kube-state-metrics
        action: keep

2. DPF Controller Metrics (secure endpoint)

DPF controller pods expose metrics over HTTPS with bearer token authentication. Prometheus must authenticate using a ServiceAccount token and trust the cluster CA. The controllers are identified by the pod label dpu.nvidia.com/component matching *-controller-manager and expose metrics on a port named metrics.

YAML
scrape_configs:
  - job_name: 'doca-platform-framework'
    scrape_interval: 15s
    metrics_path: /metrics
    scheme: https
    authorization:
      type: Bearer
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
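      # Disables server certificate verification, so ca_file is not enforced.
      # Set to false if the controller certificates are signed by the cluster CA.
      insecure_skip_verify: true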
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_dpu_nvidia_com_component]
        action: keep
        regex: ".*-controller-manager"
      - source_labels: [__meta_kubernetes_pod_container_port_name]
        action: keep
        regex: metrics
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: pod

The Prometheus ServiceAccount must have permissions to access the /metrics endpoint on the controller pods. When using the DPF Helm chart reference configuration, this is handled by the kube-prometheus-stack-prometheus ServiceAccount.
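For an existing Prometheus that runs under a different ServiceAccount, the permission can be sketched as a ClusterRole on the /metrics non-resource URL plus a binding. This assumes the DPF controllers authorize metrics requests through the Kubernetes API in the way commonly used by controller-runtime based controllers; the role name, ServiceAccount name, and namespace below are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: dpf-metrics-reader      # hypothetical name
rules:
  # Allow GET requests on the /metrics non-resource URL.
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dpf-metrics-reader      # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: dpf-metrics-reader
subjects:
  - kind: ServiceAccount
    name: prometheus            # hypothetical: the ServiceAccount your Prometheus runs as
    namespace: monitoring       # hypothetical namespace
```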

Dashboard Import (Grafana)

DPF dashboards are stored as JSON files in the Helm chart and deployed as ConfigMaps labeled with grafana_dashboard: "1". There are two ways to import them:

Option 1: Grafana Sidecar (automatic)

If Grafana is deployed with the sidecar container, configure it to watch for ConfigMaps with the grafana_dashboard: "1" label in the dpf-operator-system namespace. The DPF operator creates these ConfigMaps automatically.
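A sketch of the corresponding values for the Grafana Helm chart is shown below. It assumes Grafana is installed from the standalone chart; when Grafana is a subchart of kube-prometheus-stack, nest the same keys under grafana:. Check the key names against the chart version in use.

```yaml
sidecar:
  dashboards:
    enabled: true
    # Label key and value the sidecar watches for on ConfigMaps.
    label: grafana_dashboard
    labelValue: "1"
    # Namespace where the DPF operator creates the dashboard ConfigMaps.
    searchNamespace: dpf-operator-system
```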

Option 2: Manual Import

Download the dashboard JSON files from the DPF Helm chart dashboards directory and import them via the Grafana UI or API. See the Grafana Dashboards page for details on available dashboards.

Log Collection (Loki / OpenTelemetry)

DPU cluster logs are forwarded via OTLP by the DPF-operator-managed OpenTelemetry Collector on each DPU cluster. The target endpoint is configured in DPFOperatorConfig.spec.monitoring.openTelemetryCollector.logging.endpoint. Point this to any OTLP-compatible receiver (e.g., an existing OpenTelemetry Collector gateway, Grafana Alloy, or a cloud-hosted endpoint).
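The field path above corresponds to the following nesting inside the DPFOperatorConfig resource. This is only a fragment to merge into the existing resource; the endpoint value is a placeholder, and its expected format (scheme, host, port) is not specified here.

```yaml
# Fragment of an existing DPFOperatorConfig; all other fields are omitted.
spec:
  monitoring:
    openTelemetryCollector:
      logging:
        endpoint: <otlp-receiver-endpoint>   # placeholder for your OTLP receiver
```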

Kube-Prometheus-Stack

The Kube-Prometheus-Stack provides a unified monitoring solution combining Prometheus Operator, Prometheus, and Grafana.

DPF-Specific Configuration:

  • Automatic Dashboard Discovery: Grafana automatically discovers dashboards from ConfigMaps labeled with grafana_dashboard: "1". The DPF operator creates these ConfigMaps automatically.

  • Multi-Cluster Support: Configured with Grafana multicluster features and Prometheus cluster labeling (cluster: management)

  • ServiceMonitor Integration: Automatically scrapes metrics from components with ServiceMonitor resources

  • Control Plane Scheduling: Prometheus and Grafana are scheduled on control-plane nodes with appropriate tolerations
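The control-plane scheduling described in the last item can be sketched with the following Kube-Prometheus-Stack values. This is an illustration, not the DPF reference configuration: it assumes control-plane nodes carry the standard node-role.kubernetes.io/control-plane label and NoSchedule taint, and the reference values may use affinity rules instead of a nodeSelector.

```yaml
prometheus:
  prometheusSpec:
    # Place Prometheus on control-plane nodes (assumed label).
    nodeSelector:
      node-role.kubernetes.io/control-plane: ""
    # Tolerate the control-plane taint (assumed taint).
    tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
grafana:
  nodeSelector:
    node-role.kubernetes.io/control-plane: ""
  tolerations:
    - key: node-role.kubernetes.io/control-plane
      operator: Exists
      effect: NoSchedule
```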

For usage and configuration details, see the kube-prometheus-stack documentation.

Included Grafana Dashboards

Grafana includes both the default kube-prometheus-stack dashboards for Kubernetes monitoring and DPF-specific dashboards.

Default Kubernetes Dashboards:

  • Node metrics, Pod metrics, Namespace resources, and more from kube-prometheus-stack

DPF-Specific Dashboards:

  1. DOCA Platform Framework State: High-level overview of the operator and its controllers, highlighting key metrics such as resource status, condition states, and time to readiness

  2. Controller Runtime Dashboard: Detailed metrics and visualizations for the controllers, including reconciliation times, queue depths, and error rates

  3. Kubernetes API Server Requests Dashboard: Monitors requests made to the Kubernetes API server, helping identify performance bottlenecks or excessive API usage

All dashboards are deployed automatically when Grafana is enabled and are accessible through the Grafana web UI under the "Dashboards" section.

Grafana Admin Password

Option 1: Set manually in the Kube-Prometheus-Stack values file:

YAML
grafana:
  adminPassword: <your-password>

Option 2: Use the auto-generated password and retrieve it after deployment:

Bash
kubectl -n dpf-operator-system get secret kube-prometheus-stack-grafana \
  -ojsonpath='{.data.admin-password}' | base64 -d

Storage Configuration

Persistence behavior depends on each chart component's values and the cluster's default StorageClass. This repository uses the local-path provisioner (storageClassName: local-path) for local clusters; for production, configure an explicit storageClassName, appropriate capacity, and a backup/replication strategy in the Kube-Prometheus-Stack values.

To configure storage, modify the storage configurations in the Kube-Prometheus-Stack values file:

Prometheus:

YAML
prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: <your-storage-class>
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 8Gi

Grafana:

YAML
grafana:
  persistence:
    enabled: true
    storageClassName: <your-storage-class>

Make sure to replace <your-storage-class> with the appropriate storage class for your environment.

Control Plane Metrics

By default, kube-controller-manager and kube-scheduler bind their metrics endpoints to 127.0.0.1, and etcd exposes metrics on http://127.0.0.1:2381. To allow Prometheus to scrape them, configure these components to bind to 0.0.0.0.

The kube-apiserver exposes metrics on its secure port and is typically reachable via its Service; ensure your ServiceMonitor is configured with the correct TLS settings (e.g., tlsConfig with a valid CA and serverName) rather than changing the apiserver bind address. Avoid using insecureSkipVerify unless strictly necessary.
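In Kube-Prometheus-Stack values, the TLS settings for the kube-apiserver ServiceMonitor can be sketched as follows. The serverName value is an assumption and must match a name present in your apiserver certificate; confirm the key names against the chart version in use.

```yaml
kubeApiServer:
  tlsConfig:
    # Name expected in the apiserver serving certificate (assumed value).
    serverName: kubernetes
    # Keep certificate verification enabled.
    insecureSkipVerify: false
```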

For kubeadm Clusters:

YAML
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controllerManager:
  extraArgs:
    bind-address: "0.0.0.0"
scheduler:
  extraArgs:
    bind-address: "0.0.0.0"
etcd:
  local:
    extraArgs:
      listen-metrics-urls: "http://0.0.0.0:2381"

Multi-Cluster Configuration

The DPF monitoring stack supports monitoring both the Host Cluster and Tenant Control Planes (Kamaji) simultaneously.

Grafana Multicluster Dashboard Support:

  • Enabled via sidecar.dashboards.multicluster.global.enabled: true

  • Allows dashboards to display and filter metrics across multiple clusters using cluster labels
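In the Kube-Prometheus-Stack values file, the setting above sits under the grafana key. A minimal sketch of the nesting:

```yaml
grafana:
  sidecar:
    dashboards:
      multicluster:
        global:
          # Adds a cluster selector to the bundled dashboards.
          enabled: true
```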

Prometheus Cluster Labeling:

All metrics scraped by the Host Cluster's Prometheus instance are automatically labeled with a cluster label to identify their source. This follows the approach described in the Kamaji monitoring documentation.

The configuration uses two complementary approaches:

1. ServiceMonitor Relabeling for Control Plane Components:

The Host Cluster's control plane metrics are labeled via ServiceMonitor relabelings. The example below covers kubeApiServer; add the same relabeling to every other built-in ServiceMonitor (coreDns, kubeProxy, kubeEtcd, kubeControllerManager, kubeScheduler, kubelet):

YAML
# Add cluster label to all built-in ServiceMonitors for Host Cluster
kubeApiServer:
  serviceMonitor:
    relabelings:
      - action: replace
        targetLabel: cluster
        replacement: management

2. Global External Labels for All Metrics:

YAML
prometheus:
  prometheusSpec:
    # Add cluster label to ALL metrics via external labels
    externalLabels:
      cluster: management

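Control plane components receive an explicit cluster: management label from ServiceMonitor relabelings, which are applied at scrape time. The externalLabels setting covers metrics that have no such relabeling. Note that Prometheus attaches external labels when it sends data to external systems (remote write, federation, Alertmanager), so use relabelings wherever the label must be present on the locally stored series.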

This ensures:

  • Host Cluster control plane components (kube-apiserver, kube-controller-manager, kube-scheduler) are labeled with cluster: management

  • All other Host Cluster metrics (DPF operator, Kube-State-Metrics, kubelet, etc.) are also labeled with cluster: management

  • Kamaji tenant cluster control planes can be scraped with their own cluster labels (e.g., cluster: charlie) following the Kamaji ServiceMonitor examples

Loki

Loki provides log aggregation and storage for both Host Cluster and DPU cluster logs.

Log Sources:

  • Host Cluster logs: Collected by OpenTelemetry Collector (filelog receiver)

  • DPU Cluster logs: Forwarded by DPU OpenTelemetry Collectors via OTLP

Integration:

  • Logs are stored with cluster labels (cluster: management or cluster: dpucluster-<name>) for multi-cluster filtering

  • Accessible via Grafana's Explore interface using LogQL queries

For querying and configuration details, see the Loki documentation.

OpenTelemetry Collector

The OpenTelemetry Collector on the Host Cluster runs as a DaemonSet with two purposes:

  1. OTLP Receiver: Receives logs from DPU cluster OpenTelemetry Collectors (gRPC 4317, HTTP 4318, exposed via NodePort 30318)

  2. Local Log Collection: Collects logs from Host Cluster pods via filelog receiver

Exporters:

  • Logs to Loki via otlphttp exporter

  • Metrics to Prometheus via prometheusremotewrite exporter
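The receivers and exporters listed above can be wired together roughly as in the following collector configuration. This is a sketch, not the configuration shipped with DPF: the Loki and Prometheus URLs and the log file glob are hypothetical, and the source of the metrics pipeline (here the OTLP receiver) is an assumption.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  filelog:
    include:
      - /var/log/pods/*/*/*.log        # hypothetical path for Host Cluster pod logs
exporters:
  otlphttp:
    endpoint: http://loki.example.svc:3100/otlp                  # hypothetical Loki URL
  prometheusremotewrite:
    endpoint: http://prometheus.example.svc:9090/api/v1/write    # hypothetical Prometheus URL
service:
  pipelines:
    logs:
      receivers: [otlp, filelog]
      exporters: [otlphttp]
    metrics:
      receivers: [otlp]                # assumption: metrics arrive over OTLP
      exporters: [prometheusremotewrite]
```

The prometheusremotewrite exporter only works if the target Prometheus accepts remote write requests.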

For configuration details, see the OpenTelemetry Collector documentation.

Kube-State-Metrics

The Kube-State-Metrics instance deployed via helmfile monitors Host Cluster Kubernetes resources (Pods, Deployments, Nodes, etc.). It is separate from the operator-deployed Kube-State-Metrics instance, which monitors DPF custom resources.

Difference from DPF-Operator-Managed KSM:

  • Helmfile KSM: Monitors standard Kubernetes resources on Host Cluster

  • Operator KSM: Monitors DPF custom resources (DPU, IPPool, ServiceChain) on DPU clusters

Both expose metrics to Prometheus via ServiceMonitor.

For more information, see the Kube-State-Metrics documentation.
