NVIDIA UFM Enterprise User Manual

UFM Prime

Overview 

The UFM Prime feature allows for the management of large fabrics, consisting of multiple sites, within a single product.

This feature consists of two layers: UFM Multi-Subnet Provider and UFM Multi-Subnet Consumer.

The UFM Provider functions as a Multi-Subnet Provider, exposing all local InfiniBand fabric information to the UFM Consumer. The UFM Consumer acts as a Multi-Subnet Consumer, collecting and aggregating data from the currently configured UFM Providers so that users can manage multiple sites in one place. While the UFM Consumer offers functionality similar to that of regular UFM, several behaviors differ because of aggregation.

Setting Up UFM Prime

In /opt/ufm/files/conf/gv.cfg, fill in the section named [Multisubnet] for UFM Multi-Subnet Provider and Consumer. 

To set up UFM as a Multi-Subnet Provider, perform the following:

  • Set multisubnet_enabled to true

  • Set multisubnet_role to provider

  • Set multisubnet_site_name (optional, if not set, it will be randomly generated); e.g., provider_1

  • Start UFM

To set up UFM as a Multi-Subnet Consumer, perform the following: 

  • Set multisubnet_enabled to true

  • Set multisubnet_role to consumer

  • Start UFM

Note that the UFM Multi-Subnet Consumer can be configured on a machine or VM that has no InfiniBand connectivity. Additionally, users may customize the UFM Provider and Consumer using the optional configuration parameters found in the [Multisubnet] section of /opt/ufm/files/conf/gv.cfg.
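
As a rough illustration of the two role configurations described above, the following Python sketch renders the text of a [Multisubnet] section for each role. The section and key names come from the steps above. The helper name render_multisubnet_section is hypothetical, and the lowercase spelling of the value true is an assumption. The sketch only prints text and does not edit gv.cfg, because rewriting that file programmatically could discard comments and unrelated settings.

```python
import configparser
import io


def render_multisubnet_section(role, site_name=None):
    """Return the text of a [Multisubnet] section for the given role.

    Hypothetical helper for illustration; it is not part of UFM.
    """
    if role not in ("provider", "consumer"):
        raise ValueError("role must be 'provider' or 'consumer'")

    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key names exactly as written
    parser["Multisubnet"] = {
        "multisubnet_enabled": "true",  # assumed spelling of the boolean
        "multisubnet_role": role,
    }
    # The manual lists the site name only in the provider steps, and it is
    # optional there, so it is emitted only when a provider name is given.
    if role == "provider" and site_name:
        parser["Multisubnet"]["multisubnet_site_name"] = site_name

    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


print(render_multisubnet_section("provider", "provider_1"))
print(render_multisubnet_section("consumer"))
```

The printed text can be compared against the [Multisubnet] section of gv.cfg before starting UFM.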

Network Configuration for UFM Multi-Subnet Provider

For the UFM Consumer to successfully collect data from a UFM Provider, the following network prerequisites must be met on the provider machine.

Required Ports

The consumer connects to the provider over HTTP on two ports:

Port            Purpose                               Configuration Parameter
7102 (default)  Topology data                         multisubnet_topology_provider_port
9001 (default)  Telemetry data (Prometheus endpoint)  prometheus_port

Both ports must be open and reachable from the consumer machine. If your environment uses a firewall, ensure inbound traffic on these ports is allowed. For UFM Appliance deployments, refer to the NVIDIA UFM Appliance User Manual for instructions on configuring firewall rules.
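
One way to confirm reachability from the consumer machine is a plain TCP connection test. The Python sketch below is a minimal example under stated assumptions: the port numbers are the defaults listed above, the function names are hypothetical, and a successful TCP connection only shows that the port is open, not that the provider is returning valid data.

```python
import socket

# Default provider ports listed above; adjust if the provider overrides them.
DEFAULT_PROVIDER_PORTS = {"topology": 7102, "telemetry": 9001}


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_provider_ports(host, ports=None, timeout=3.0):
    """Map each port name to True (reachable) or False (not reachable)."""
    ports = DEFAULT_PROVIDER_PORTS if ports is None else ports
    return {
        name: port_is_reachable(host, port, timeout)
        for name, port in ports.items()
    }
```

For example, calling check_provider_ports with the provider's IP address from the consumer machine returns a dictionary such as {"topology": True, "telemetry": False}, which would suggest that the telemetry port is blocked or bound to a local address only.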

Telemetry Binding Address

By default, the bind address of the telemetry endpoint (primary_ip_bind_addr in the [Telemetry] section of /opt/ufm/files/conf/gv.cfg) is 127.0.0.1, which restricts the endpoint to local connections only. The topology provider (port 7102) already listens on all interfaces and requires no change, but the telemetry endpoint must be reconfigured. On the provider, set:
primary_ip_bind_addr = 0.0.0.0

This makes the telemetry endpoint listen on all network interfaces, allowing the consumer to connect remotely. Alternatively, you can set it to a specific IP address that is routable from the consumer.

Warning: Setting primary_ip_bind_addr to 0.0.0.0 exposes the telemetry endpoint on all network interfaces. Ensure that appropriate firewall rules are in place to restrict access to trusted consumer machines only.
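
To double-check the binding described above, the following Python sketch classifies the configured address. It assumes that the relevant part of gv.cfg can be parsed by Python's configparser, and the function name telemetry_bind_status is hypothetical. It expects a literal IP address; any other value raises ValueError.

```python
import configparser
import ipaddress


def telemetry_bind_status(cfg_text):
    """Classify primary_ip_bind_addr as 'loopback', 'all', or 'specific'."""
    # Interpolation is disabled so that '%' characters elsewhere in the file
    # are not interpreted; strict=False tolerates duplicate entries.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(cfg_text)
    raw = parser.get("Telemetry", "primary_ip_bind_addr").strip()
    addr = ipaddress.ip_address(raw)
    if addr.is_loopback:
        return "loopback"  # local connections only; consumer cannot connect
    if addr.is_unspecified:
        return "all"       # 0.0.0.0: listens on every interface
    return "specific"      # a single address; must be routable from consumer


sample = "[Telemetry]\nprimary_ip_bind_addr = 0.0.0.0\n"
print(telemetry_bind_status(sample))
```

A result of "loopback" indicates that the provider still uses the default binding and that the consumer will not be able to collect telemetry from it.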

Functionality 

  1. Following the initial launch of the Consumer, the Dashboard view is devoid of data, and a message containing a hyperlink leading to the Provider Management section is displayed.

    MULTI-SUBNET1.png
    mullti-subnet1a.png

  2. As shown in the snapshot below, a new Provider Management section has been added, enabling users to configure UFM Providers.
    multi-subnet2.png
    To add a provider, enter its IP address and credentials. Unless multiple UFM Provider instances run on a single machine, leave the advanced section parameters at their default values. If there are multiple instances, the advanced parameters may be set per Provider and then configured in the Provider Management view.
    By editing a Provider in this view, you can change the Provider's credentials.
    The "Delete Provider" function removes the selected Provider from the Consumer. This action may take some time to complete, and the change may be reflected in the view only after approximately 30 seconds.

  3. A general filter has been added to the top right corner of the page, enabling users to filter displayed data by site.

    multi-subnet3.png  
    multi-subnet3a.png
    multi-subnet3b.png

    multi-subnet3c.png

  4. If an XDR router connects subnets, the Site column for that router lists the sites that the router connects.
    device_view_1.jpg
    In the ports view of a router, the Site column shows the name of the site to which each port is connected.
    device_view_3.jpg

  5. The Network Map contains a "cloud" for each provider.
    multi-subnet4.png
    multi-subnet4a.png

  6. If an XDR router connects sites, the map shows the router as a separate entity.

    network_map_1.jpg
    network_map_2.jpg
    network_map_3.jpg
    network_map_4.jpg

  7. A "Site Name" column is present in all Managed Elements sections. The column is disabled (hidden) by default. 
    multi-subnet5a.png
    multi-subnet5.png
    multi-subnet5b.png

  8. The "Group" and "Telemetry" sections include "Site" filters. 
    multi-subnet6.png

  9. The filter in "Groups" impacts the Members table only.

    multi-subnet7a.png multi-subnet7.png

  10. In the System Health tab, subsections for Consumer and Provider are available.

    1. Consumer System Health tab contains sections applicable to Consumer UFM specifically (e.g., logs from Consumer UFM).
      ulti-subnet8a.png

    2. Provider System Health contains sections applicable to one or multiple providers (e.g., Fabric Health Report can be triggered on multiple Providers from the Consumer). 
      ulti-subnet8b.png

  11. The Topology Compare view contains sub-report tables for each provider.
    topology_compare_7.jpg
    To view a report, select a specific provider.
    topology_compare_9.jpg
    A custom topology compare can be run once a specific site is selected.
    topology_compare_5.jpg

  12. The UFM Health tab contains sub-report tables for each provider.
    multi-subnet9.png

  13. The Fabric Health tab contains sub-report tables for each provider.
    multi-subnet10.png

  14. Daily Reports:

    1. Consumer Daily reports display consumer reports.

      ulti-subnet11a.png
    2. Providers Daily reports display reports from all providers.

      ulti-subnet11b.png
  15. The "Fabric Validation" tab contains sub report tables for each provider.
    ulti-subnet12.png

  16. In "UFM Logs" Tab: 

    1. Consumer logs:

      ulti-subnet13a.png
    2. Providers logs: logs are displayed for one provider at a time. Displaying logs for all providers together is not supported.

      mullti-subnet13b.png
  17. In the "System Dump" tab: 

    1. "Consumer System Dump" collects a system dump for the consumer.
        multi-subnet14b.png

    2. "Providers System Dump" collects system dumps for one or all providers and merges them into one folder.
      multi-subnet14a.png

  18. Under "Settings", subsections for Consumer and Provider are available. 

    1. "Consumer Settings" contains sections applicable to the Consumer UFM specifically (e.g., creation of access tokens for UFM Consumer authentication).
      multi-subnet15b.png

    2. "Provider Settings" contains sections applicable to one or multiple providers (e.g., Event Policies can be changed for multiple Providers at once from the Consumer).
        multi-subnet15a.png

  

 
