NMX Telemetry (NMX-T) Documentation

Telemetry Collection Configuration

NMX Telemetry supports three levels of configuration interface, each suited to a different deployment and usage mode:

  • Configuration with the NVOS/NVUE command line interface, for NVOS/NVUE users

  • Configuration via NMX-C, for standalone distributions of NMX cluster applications

  • Configuration with a custom HTTP interface, for package-based installations

All three configuration levels are eventually consistent.

Figure: NMX-T configuration flow

Configuration Unit and Format

  • The configuration unit is a file.

      • The entire configuration file is saved and loaded, including all the parameters.

      • A file may contain one or many of the parameters listed under Configuration Parameters.

      • If a parameter is missing, its default value is implied.

  • File content must comply with the Linux INI format (https://en.wikipedia.org/wiki/INI_file), with the following restrictions:

      • Format is "key = value"

      • No support for sections

      • No support for hierarchy

      • Case insensitive

      • Support for comments

      • No support for duplicate names

      • Support for quoted values

      • Support for line continuation

      • Support for escape characters

Example:

nvl_telemetry.enabled = true
nvl_telemetry.update_period = 30
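
Some of the restrictions above can be checked locally before a file is uploaded. The following is a minimal sketch, not part of NMX-T: check_flat_ini is a hypothetical helper, it assumes that comment lines start with "#" or ";", and it does not interpret quoted values, line continuation, or escape characters.

```shell
# check_flat_ini FILE
# Hypothetical helper (not shipped with NMX-T). Reports section headers,
# lines without "key = value", empty keys, and case-insensitive duplicates.
# Assumption: comment lines start with '#' or ';'.
check_flat_ini() {
    awk '
        /^[ \t]*$/    { next }    # skip blank lines
        /^[ \t]*[#;]/ { next }    # skip comment lines (assumed markers)
        /^[ \t]*\[/   { printf "line %d: sections are not supported\n", NR; bad = 1; next }
        {
            pos = index($0, "=")
            if (pos == 0) { printf "line %d: expected key = value\n", NR; bad = 1; next }
            key = tolower(substr($0, 1, pos - 1))    # names are case insensitive
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            if (key == "") { printf "line %d: empty key\n", NR; bad = 1; next }
            if (key in seen) { printf "line %d: duplicate key %s\n", NR, key; bad = 1 }
            seen[key] = NR
        }
        END { exit bad }          # non-zero exit status if any problem was found
    ' "$1"
}
```

For example, check_flat_ini telemetry.ini prints one line per problem and returns a non-zero status if any were found.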

Configuration Parameters

The supported configuration parameters are listed below. To retrieve an up-to-date list from a running instance, see List the Supported Configuration Parameters.

| Parameter | Type | Description | Default |
|---|---|---|---|
| connector.log_data | boolean | Dump telemetry data payloads to the connector's log output | false |
| NVL5 Telemetry configuration | | | |
| nvl_telemetry.enabled | boolean | Enable or disable telemetry collection | true |
| nvl_telemetry.log_level | integer | Log message level: 0 - disabled, 6 - Info, 7 - Debug | 6 |
| nvl_telemetry.update_period | integer | Telemetry sampling and update period in seconds | 60 |
| nvl_telemetry.collect.device.gpu | boolean | Enable or disable collection of the GPU counters | true |
| nvl_telemetry.collect.device.hca | boolean | Enable or disable collection of the HCA counters | true |
| nvl_telemetry.collect.device.router | boolean | Enable or disable collection of the router counters | true |
| nvl_telemetry.collect.device.switch | boolean | Enable or disable collection of the switch counters | true |
| nvl_telemetry.collect.down_port_counters | boolean | Provide counters for ports that are down | true |
| Telemetry generator configuration | | | |
| nvl_telemetry.generator.enabled | boolean | Enable or disable the telemetry data generator | false |
| nvl_telemetry.generator.profile.counters | integer | Number of counters in the data generation profile | 1900 |
| nvl_telemetry.generator.profile.fields | integer | Number of fields in each event type within the data generation profile | 10 |
| nvl_telemetry.generator.profile.ports | integer | Number of ports in the data generation profile | 48 |
| nvl_telemetry.generator.profile.types | integer | Number of event data types in the data generation profile | 10 |
| OpenTelemetry exporter configuration | | | |
| nvl_telemetry.otlp.target | string | URL of an OpenTelemetry receiver target | |
| nvl_telemetry.otlp.counter_set | string | Name of the counter-set to apply to the OpenTelemetry exporter | |
| nvl_telemetry.otlp.field_set | string | Name of the field-set to apply to the OpenTelemetry exporter | |
| nvl_telemetry.otlp.inbound_queue | integer | Length, in data blocks, of the intake processing queue the OpenTelemetry exporter operates on | 10000 |
| Prometheus remote write configuration | | | |
| nvl_telemetry.remote_write.target | string | URL of a Prometheus remote write receiver target | |
| nvl_telemetry.remote_write.counter_set | string | Name of the counter-set to apply to the remote write exporter | |
| nvl_telemetry.remote_write.field_set | string | Name of the field-set to apply to the remote write exporter | |
| nvl_telemetry.remote_write.inbound_queue | integer | Length, in data blocks, of the intake processing queue the Prometheus remote write exporter operates on | 10000 |
| gNMI aggregator configuration | | | |
| gnmi_aggregator.enabled | boolean | Enable or disable gNMI aggregation | true |
| gnmi_aggregator.targets.0.host | string | Name or IP address of the gNMI target | |
| gnmi_aggregator.targets.0.port | integer | Port number of the gNMI target | 9339 |
| gnmi_aggregator.targets.0.auth.user | string | User name for basic authentication | |
| gnmi_aggregator.targets.0.auth.password | string | Password for basic authentication. See Secrets Encoding below. | |


NVOS Command Line Interface

The configuration workflow maps onto the following CLI commands:

  1. List the configuration files available to NMX-C. The output should show that both nmx-telemetry and nmx-controller are installed, and that the description of nmx-controller includes the component "telemetry".

    nv show cluster apps
    
  2. Get the configuration file.

    nv action generate sdn config app nmx-telemetry type telemetry
    
  3. Locate the copy of the configuration file.

    nv show sdn config app nmx-telemetry type telemetry file
    

    The file location is similar to the following:

    /host/cluster_infra/app_config/nmx-telemetry/telemetry/nmx-telemetry_telemetry_20241202_113751
    
  4. Modify the file as needed.

    cat /host/cluster_infra/app_config/nmx-telemetry/telemetry/nmx-telemetry_telemetry_20241202_113751
    ...
    vim /host/cluster_infra/app_config/nmx-telemetry/telemetry/nmx-telemetry_telemetry_20241202_113751
    
  5. Upload the modified configuration by providing the name of the file.

    nv action install sdn config app nmx-telemetry type telemetry files nmx-telemetry_telemetry_20241202_113751
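
Step 4 can also be scripted instead of using an interactive editor. The following is a minimal sketch under stated assumptions: set_param is a hypothetical helper, not an NVOS or NMX-T command, and it assumes the generated file uses the flat "key = value" layout described under Configuration Unit and Format. It does not handle quoted values or line continuation.

```shell
# set_param FILE KEY VALUE
# Hypothetical helper: replaces the existing KEY line with "KEY = VALUE",
# or appends the line if KEY is absent. Keys are matched case-insensitively.
set_param() {
    _tmp=$(mktemp) || return 1
    # Pass key and value through the environment so awk does not
    # interpret backslashes in them.
    P_KEY=$2 P_VALUE=$3 awk '
        BEGIN { key = ENVIRON["P_KEY"]; value = ENVIRON["P_VALUE"]; want = tolower(key) }
        {
            pos = index($0, "=")
            name = (pos > 0) ? tolower(substr($0, 1, pos - 1)) : ""
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (pos > 0 && name == want) {
                if (!done) print key " = " value   # keep a single line per key
                done = 1
                next
            }
            print                                   # all other lines pass through
        }
        END { if (!done) print key " = " value }    # key was absent: append it
    ' "$1" > "$_tmp" && cat "$_tmp" > "$1"
    _status=$?
    rm -f "$_tmp"
    return "$_status"
}
```

For example, set_param followed by the file path from step 3, nvl_telemetry.update_period, and 30 changes the sampling period before the file is installed in step 5.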
    

NMX-Controller gRPC Interface

This is the mid-level configuration entry point. Use the NMX-C gRPC interface to retrieve and update the telemetry configuration.

Starting from NMX-C 0.8.0_2024-12-12, the default key/value separator for configuration tables is a space. For compatibility reasons, the following commands convert the space separator into an equal sign separator and vice versa. The default value must be updated.

  1. Register the client's identifier "test_client" to enable the gateway to recognize the client.

    docker run --network=host fullstorydev/grpcurl -plaintext -d \
        '{ "gatewayId": "test_client", "major_version": "PROTO_MSG_MAJOR_VERSION", "minor_version": "PROTO_MSG_MINOR_VERSION" }' \
        0.0.0.0:9372 nmxc_service.NMXCService.Hello
    
  2. Get the telemetry configuration file.

    docker run --network=host fullstorydev/grpcurl -plaintext -d \
        '{"configFiles": {"configFile": [{"configFileName": "telemetry"}]},"gatewayId": "test_client"}'\
         0.0.0.0:9372 nmxc_service.NMXCService.GetStaticConfig \
      | jq -r '.staticConfig.configFileContents.configFileContent[] | .configFileContent ' | sed -e 's/   / = /' | sort > telemetry.ini
    
  3. Set the configuration file.

    docker run --network=host fullstorydev/grpcurl -plaintext -d \
      "$(jq -R -s '{"gatewayId": "test_client","staticConfig": {"configFileContents": {"configFileContent": [{"configFileName": "telemetry","configFileContent": gsub(" = "; "   ")}]}}}' telemetry.ini)" \
      0.0.0.0:9372 nmxc_service.NMXCService.SetStaticConfig
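
The separator conversion embedded in steps 2 and 3 can be tried offline, without a running NMX-C. The following sketch wraps the same substitutions in two helper functions; the names to_ini and to_nmxc are hypothetical and introduced here only for illustration.

```shell
# NMX-C separator -> "key = value"; same sed expression as in step 2.
# sed without the g flag rewrites only the first match on each line.
to_ini() {
    sed -e 's/   / = /'
}

# "key = value" -> NMX-C separator; mirrors the jq gsub in step 3,
# which rewrites every " = " in the content (hence the g flag).
to_nmxc() {
    sed -e 's/ = /   /g'
}
```

For example, to_nmxc < telemetry.ini shows the content as it would be sent by step 3. A value that itself contains " = " would also be rewritten, so such values are worth checking by hand.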
    

NMX-Telemetry Control Interface

This is the low-level configuration entry point. Use the NMX-T runtime configuration interface to retrieve and update the telemetry configuration.

Get the Full Configuration File

  • In flattened INI format, compatible with NVOS

curl http://0.0.0.0:9350/telemetry/config

  • In JSON structured format

curl -H "Accept: application/json" http://0.0.0.0:9350/telemetry/config | jq

Upload the Full Configuration File

Using the POST method, the entire configuration state is replaced by the content of the file, with missing fields populated by default values.

  • In flattened INI format, compatible with NVOS

curl -X POST --data-binary @telemetry.ini http://0.0.0.0:9350/telemetry/config

  • In JSON structured format

curl -X POST -H "Content-Type: application/json" --data @telemetry.json http://0.0.0.0:9350/telemetry/config

Partial Configuration Update

Using the PUT method, the configuration state is updated with the file content, while leaving unspecified parameters unchanged.

  • In flattened INI format, compatible with NVOS

curl -X PUT --data-binary $'nvl_telemetry.enabled = true\nnvl_telemetry.update_period = 120' http://0.0.0.0:9350/telemetry/config

  • In JSON structured format

curl -X PUT -H "Content-Type: application/json" --data-binary '{"nvl_telemetry":{"enabled": true, "update_period": 300}}' http://0.0.0.0:9350/telemetry/config
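
For repeated partial updates, the INI payload can be assembled from arguments rather than typed inline. The following is a minimal sketch: ini_payload, put_params, and the NMXT_CONFIG_URL variable are hypothetical names, and the endpoint is assumed to be the one used in the examples above.

```shell
# Assumed endpoint, taken from the examples above; override as needed.
NMXT_CONFIG_URL=${NMXT_CONFIG_URL:-http://0.0.0.0:9350/telemetry/config}

# ini_payload KEY=VALUE...
# Hypothetical helper: prints one "key = value" line per argument.
ini_payload() {
    for pair in "$@"; do
        case $pair in
            ?*=*) printf '%s = %s\n' "${pair%%=*}" "${pair#*=}" ;;
            *)    echo "ini_payload: expected KEY=VALUE, got '$pair'" >&2
                  return 1 ;;
        esac
    done
}

# put_params KEY=VALUE...
# Sends the generated lines as a partial update (PUT); curl reads the
# request body from standard input via "@-".
put_params() {
    payload=$(ini_payload "$@") || return 1
    printf '%s\n' "$payload" | curl -X PUT --data-binary @- "$NMXT_CONFIG_URL"
}
```

For example, put_params nvl_telemetry.enabled=true nvl_telemetry.update_period=120 sends the same body as the INI example above, and ini_payload alone previews the body without contacting the service.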

List the Supported Configuration Parameters

Retrieve the list of parameters as a plain table.

curl http://0.0.0.0:9350/telemetry/config/params

Secrets Encoding

To ensure secure storage of sensitive authentication data - such as gNMI switch interface passwords - NMX-T includes a built-in secrets encoding tool.

Encoding a Secret

echo -n "password" | /usr/share/cluster_pkgs/nmx-telemetry/scripts/secrets.sh encode

Applying the Encoded Secret

Once encoded, insert the value into the telemetry configuration as follows:

curl -X PUT --data-binary $'gnmi_aggregator.targets.0.auth.password = ENCODEDSECRET' http://0.0.0.0:9350/telemetry/config
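
The two steps can be chained so that the encoded value never has to be copied by hand. The following is a sketch under stated assumptions: password_line and apply_gnmi_password are hypothetical helpers, and the secrets tool is assumed to print only the encoded value on standard output.

```shell
# Paths and endpoint taken from the examples above; override as needed.
SECRETS_TOOL=${SECRETS_TOOL:-/usr/share/cluster_pkgs/nmx-telemetry/scripts/secrets.sh}
NMXT_CONFIG_URL=${NMXT_CONFIG_URL:-http://0.0.0.0:9350/telemetry/config}

# password_line ENCODED
# Hypothetical helper: prints the INI line that carries the encoded secret.
password_line() {
    printf 'gnmi_aggregator.targets.0.auth.password = %s\n' "$1"
}

# apply_gnmi_password
# Reads the clear-text password from standard input, encodes it, and
# applies it as a partial update (PUT).
# Assumption: the tool writes only the encoded value to standard output.
apply_gnmi_password() {
    encoded=$("$SECRETS_TOOL" encode) || return 1
    password_line "$encoded" | curl -X PUT --data-binary @- "$NMXT_CONFIG_URL"
}
```

For example, printf '%s' "$GNMI_PASSWORD" | apply_gnmi_password encodes and applies a password held in the (hypothetical) GNMI_PASSWORD variable.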
