NVIDIA UFM Enterprise REST API Guide

UFM Dynamic Telemetry Instances REST API

Dynamic telemetry instance management lets users request the creation of multiple telemetry instances. With it, users can create new UFM Telemetry instances with their preferred counters and configurations. These instances are not initiated by the UFM; instead, their operational status is monitored using the UFM Telemetry bring-up tool.

Instantiate a New Instance

  • Description: Instantiates a new telemetry instance according to the configuration given in the request parameters.

  • URL: POST https://10.209.36.126/ufmRestV2/app/telemetry/instances/cset_name 

  • Request Data:

    Parameter: Description

    requested_guids: An array of objects, each specifying a node GUID and the requested ports on that GUID.

    guid: (within each requested_guids object) A string specifying the node GUID whose metrics are requested.

    ports: (within each requested_guids object) An array of integers specifying the ports of the requested GUID.

    counters: An array of strings specifying the names of the metrics counters to be retrieved. Only supported counters can be sent; they can be retrieved via the supported counters API.

    configuration: An optional object specifying additional configuration parameters.

    sample_rate: An integer specifying the rate at which the metrics are sampled.

    base_config: An optional string specifying the base configuration to be used.

    ttl: An optional string specifying the time-to-live (TTL) for the metrics data.

    is_registered_discovery: An optional boolean value indicating whether the metrics are registered with the discovery service.

    is_async: An optional boolean value. If this parameter is sent, the creation becomes asynchronous and a job_id is returned. To get the status of this job, refer to the Jobs API. Using this parameter is recommended.

  • Response: The port number used to communicate with the newly instantiated instance.

  • Request Example: 

    Content-Type: application/json 
    { 
      "requested_guids": [ 
        { 
          "guid": "xyz123", 
          "ports": [8080, 8081, 8082] 
        }, 
        { 
          "guid": "abc456", 
          "ports": [9090] 
        } 
      ], 
      "counters": ["cpu", "memory"], 
      "configuration": { 
        "setting1": "value1", 
        "setting2": "value2" 
      }, 
      "sample_rate": 5, 
      "base_config": "config1", 
      "ttl": "24h", 
      "is_registered_discovery": true 
    } 
    The API will return a port that will be exposed by the UFM Telemetry. 
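
As a rough illustration of the request described above, the following Python sketch builds (but does not send) the POST request using only the standard library. The helper name build_create_request, the instance name "my_cset", and the chosen field values are assumptions made for this example; authentication and TLS settings depend on the deployment and are deliberately left out.

```python
import json
import urllib.request

# Host taken from the URL in this guide; replace it with your own UFM server.
UFM_HOST = "10.209.36.126"


def build_create_request(cset_name, requested_guids, counters, sample_rate,
                         ttl=None, is_async=None):
    """Build (but do not send) the POST request that instantiates an instance.

    Hypothetical helper for illustration; it is not part of UFM.
    """
    payload = {
        "requested_guids": requested_guids,
        "counters": counters,
        "sample_rate": sample_rate,
    }
    # Optional fields are included only when the caller sets them.
    if ttl is not None:
        payload["ttl"] = ttl
    if is_async is not None:
        payload["is_async"] = is_async
    url = f"https://{UFM_HOST}/ufmRestV2/app/telemetry/instances/{cset_name}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_request(
    "my_cset",  # assumed instance name
    requested_guids=[{"guid": "248a0703008dae46", "ports": [1]}],
    counters=["PortXmitDataExtended", "PortRcvDataExtended"],
    sample_rate=20,
    ttl="1h",
    is_async=True,
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it, once credentials and certificate handling suitable for the environment have been added.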

Get All Instances

  • Description: Returns a list of all instances with their configuration and ports.

  • URL: GET https://10.209.36.126/ufmRestV2/app/telemetry/instances

  • Response Example:

    { 
        "<cset_name>": { 
            "name": "<cset_name>",
            "requested_guids": [ 
                { 
                    "guid": "248a0703008dae46", 
                    "ports": [ 
                        1 
                    ] 
                } 
            ], 
            "counters": [ 
                "PortXmitDataExtended", 
                "PortRcvDataExtended" 
            ], 
            "sample_rate": 20, 
            "ttl": "1h", 
            "base_config": "", 
            "endpoint_port": 9007, 
            "status": "", 
            "is_registered_discovery": true, 
            "root_dir": "/opt/ufm/files/dynamic_telemetry/<cset_name>",
            "configuration": { 
                "num_iterations": "20000", 
                "plugin_env_CLX_EXPORT_API_SHOW_STATISTICS": 1, 
                "plugin_env_UFM_TELEMETRY_MANAGED_MODE": 1 
            }, 
            "conf_file": "", 
            "hca": "mlx5_0", 
            "pid": 7837 
        } 
    } 
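
A response of this shape can be reduced to a name-to-port lookup. The sketch below is a minimal example that assumes endpoint_port is the port on which each instance exposes its telemetry; the helper endpoint_ports and the instance name "my_cset" (standing in for <cset_name>) are hypothetical.

```python
import json

# Trimmed copy of the response above; "my_cset" stands in for <cset_name>.
response_text = """
{
    "my_cset": {
        "name": "my_cset",
        "sample_rate": 20,
        "ttl": "1h",
        "endpoint_port": 9007,
        "status": "",
        "pid": 7837
    }
}
"""


def endpoint_ports(instances):
    """Map each instance name to its endpoint_port value."""
    return {name: info["endpoint_port"] for name, info in instances.items()}


instances = json.loads(response_text)
ports = endpoint_ports(instances)
print(ports)  # {'my_cset': 9007}
```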
    

Get Specific Instance Configuration

  • Description: Gets a specific instance configuration.

  • URL: GET https://10.209.36.126/ufmRestV2/app/telemetry/instances/cset_name  

  • Request Data: N/A

  • Response Example: 

    {
        "pdr_dynamic": {
            "name": "pdr_dynamic",
            "requested_guids": [
                {
                    "guid": "248a0703008fa280",
                    "ports": [
                        1,
                        1,
                        1,
                        1
                    ]
                },
                {
                    "guid": "ec0d9a0300bf551c",
                    "ports": [
                        1
                    ]
                },
                {
                    "guid": "e8ebd3030064b7c6",
                    "ports": [
                        1,
                        1
                    ]
                },
                {
                    "guid": "043f720300b818a0",
                    "ports": [
                        39
                    ]
                },
                {
                    "guid": "7cfe900300d5ba54",
                    "ports": [
                        1,
                        1,
                        1
                    ]
                },
                {
                    "guid": "98039b03009fce76",
                    "ports": [
                        1
                    ]
                }
            ],
            "counters": [
                "phy_raw_errors_lane0",
                "phy_raw_errors_lane1",
                "phy_raw_errors_lane2",
                "phy_raw_errors_lane3",
                "phy_effective_errors",
                "phy_symbol_errors"
            ],
            "sample_rate": 300,
            "ttl": "10000d",
            "base_config": "",
            "endpoint_port": 9007,
            "status": {
                "managed_mode": true,
                "start_time": 1683039674.951503,
                "num_ports": 29,
                "status": "running",
                "iteration_time_sec": 0.274126,
                "export_time_sec": 0.000279,
                "port_counters_time_sec": 0.010115,
                "ports_per_sec": 2867.029164607019,
                "timestamp": 1683093341.727322
            },
            "is_registered_discovery": true,
            "root_dir": "/opt/ufm/files/dynamic_telemetry/pdr_dynamic",
            "configuration": {
                "plugin_env_UFM_TELEMETRY_MANAGED_MODE": 1,
                "plugin_env_CLX_EXPORT_API_SHOW_STATISTICS": 1
            },
            "conf_file": "",
            "hca": "mlx5_0",
            "pid": 3662593
        }
    }
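
The status field is an object here but an empty string in the Get All Instances example, so client code may need to handle both. The sketch below is one way to do that; the helper summarize_status is hypothetical, and treating start_time and timestamp as Unix epoch seconds is an assumption based on their values.

```python
# Trimmed copy of the "pdr_dynamic" entry from the response example above.
instance = {
    "name": "pdr_dynamic",
    "sample_rate": 300,
    "status": {
        "managed_mode": True,
        "start_time": 1683039674.951503,
        "num_ports": 29,
        "status": "running",
        "timestamp": 1683093341.727322,
    },
}


def summarize_status(instance):
    """Return (state, seconds_between_start_and_timestamp) for one entry.

    If "status" is not an object (for example the empty string), the state
    is reported as "unknown" and the duration as None.
    """
    status = instance.get("status")
    if not isinstance(status, dict):
        return ("unknown", None)
    # Assumption: both values are Unix epoch seconds.
    observed = status["timestamp"] - status["start_time"]
    return (status.get("status", "unknown"), observed)


state, observed = summarize_status(instance)
print(state, round(observed))  # running 53667
```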
    

Change Running Instance

  • Description: Modifies the run configuration of an active telemetry instance. Specifically, the user is permitted to alter a specific set of GUIDs and the sample rate in their request.

  • URL: PUT https://10.209.36.126/ufmRestV2/app/telemetry/instances/cset_name

  • Request Data: 

    Content-Type: application/json 
    { 
      "requested_guids": [ 
        { 
          "guid": "1234", 
          "ports": [5, 1] 
        }, 
        { 
          "guid": "5678", 
          "ports": [8] 
        } 
      ], 
      "sample_rate": 5 
    } 
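
Since the description names only the GUID set and the sample rate as modifiable, a client might reject other fields before sending. The following sketch shows such a client-side check; the helper build_update_request, the base URL argument, and the instance name "my_cset" are assumptions for this example, and the server may well accept or reject fields differently.

```python
import json
import urllib.request

# Fields the description above names as modifiable on a running instance.
ALLOWED_FIELDS = {"requested_guids", "sample_rate"}


def build_update_request(base_url, cset_name, changes):
    """Build (but do not send) the PUT request for a running instance.

    Hypothetical helper; raises ValueError for fields outside ALLOWED_FIELDS.
    """
    unexpected = set(changes) - ALLOWED_FIELDS
    if unexpected:
        raise ValueError(
            f"fields not documented as modifiable: {sorted(unexpected)}"
        )
    return urllib.request.Request(
        f"{base_url}/app/telemetry/instances/{cset_name}",
        data=json.dumps(changes).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_update_request(
    "https://10.209.36.126/ufmRestV2",
    "my_cset",  # assumed instance name
    {
        "requested_guids": [{"guid": "1234", "ports": [5, 1]}],
        "sample_rate": 5,
    },
)
print(request.get_method(), request.full_url)
```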
    

Get All Instances Status

  • Description: Returns the running status and statistics of the started instances.

  • URL: GET https://10.209.36.126/ufmRestV2/app/telemetry/instances/status

  • Request Data: N/A

  • Response Example:

    { 
        "dror": { 
            "managed_mode": true, 
            "start_time": 1681422289.418903, 
            "num_ports": 1, 
            "status": "running", 
            "iteration_time_sec": 0.026844, 
            "export_time_sec": 9.4e-5, 
            "port_counters_time_sec": 0.00068, 
            "ports_per_sec": 1470.5882352941176, 
            "timestamp": 1681422417.825401 
        } 
    } 
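
A status response like this can be checked with a few lines of Python. The sketch below lists instances whose status is not "running" and also illustrates an observation from the sample data: ports_per_sec matches num_ports divided by port_counters_time_sec. That relation is inferred from the examples in this guide, not from a documented definition, and both helper names are hypothetical.

```python
import math

# The "dror" entry from the response example above.
statuses = {
    "dror": {
        "managed_mode": True,
        "start_time": 1681422289.418903,
        "num_ports": 1,
        "status": "running",
        "iteration_time_sec": 0.026844,
        "export_time_sec": 9.4e-5,
        "port_counters_time_sec": 0.00068,
        "ports_per_sec": 1470.5882352941176,
        "timestamp": 1681422417.825401,
    }
}


def not_running(statuses):
    """Names of instances whose reported status is not "running"."""
    return sorted(
        name for name, entry in statuses.items()
        if entry.get("status") != "running"
    )


def derived_ports_per_sec(entry):
    """num_ports / port_counters_time_sec (relation inferred from the samples)."""
    return entry["num_ports"] / entry["port_counters_time_sec"]


print(not_running(statuses))  # []
dror = statuses["dror"]
matches = math.isclose(
    derived_ports_per_sec(dror), dror["ports_per_sec"], rel_tol=1e-9
)
print(matches)  # True
```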
    

Pause Running Instance

Continue Running a Stopped Instance

{
  "requested_guids": [
    {
      "guid": "1234",
      "ports": [5, 1]
    },
    {
      "guid": "5678",
      "ports": [8]
    }
  ],
  "sample_rate": 5,
  "ttl": "300d"
}
  • Request Data: N/A 

  • Response Example: N/A

Get Supported Counters

Delete a Running Instance
