IB Cluster Maintenance

Prerequisite

This section lists the tools required to carry out the InfiniBand cluster maintenance and operational procedures.

  1. UFM, July 2023 software version: This comprises UFM Enterprise and at least one instance of UFM Telemetry. UFM includes an embedded UFM Telemetry instance that provides 120 basic debug counters for each port. These counters are collected periodically and, by default, are accessible through an HTTP endpoint. UFM also offers several mechanisms for pushing (streaming) UFM Telemetry data and event streams. For more information, see Retrieving UFM Issues.

  2. UFM Installation: Refer to the installation instructions for the UFM product you plan to deploy, as listed in the table below.

     UFM                                        Link to Installation Instructions
     -----------------------------------------  -----------------------------------------
     UFM Enterprise                             UFM Enterprise Installation
     UFM Enterprise Appliance                   UFM Enterprise Appliance Software Upgrade
     UFM Telemetry                              UFM Telemetry Installation
     For those opting to use their own server   UFM Installation Steps
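
As a rough illustration of reading the embedded UFM Telemetry counters over HTTP (item 1 above), the following Python sketch fetches one counter snapshot and flags ports whose error counter is nonzero. It assumes the endpoint returns CSV text with a header row. The URL, port, path, and column names (port_guid, symbol_error_counter) are hypothetical placeholders, not documented UFM values; substitute the ones used by your deployment.

```python
import csv
import io
import urllib.request

# Hypothetical endpoint: host, port, and path are placeholders,
# not values taken from the UFM documentation.
TELEMETRY_URL = "http://ufm-host.example:9001/csv/metrics"


def fetch_text(url, timeout=10.0):
    """Download the raw telemetry snapshot as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_counters(csv_text):
    """Parse CSV text (assumed format) into one dict per port row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def ports_with_errors(rows, counter):
    """Return the rows whose named counter holds an integer greater than zero."""
    flagged = []
    for row in rows:
        try:
            if int(row.get(counter, "")) > 0:
                flagged.append(row)
        except (TypeError, ValueError):
            # Missing, empty, or non-numeric values are skipped.
            continue
    return flagged
```

A typical use would be to call parse_counters(fetch_text(TELEMETRY_URL)) on a schedule and pass the result to ports_with_errors with the counter of interest. Parsing is kept separate from fetching so the same logic can run against a saved snapshot.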


 
