HowTo Install NVIDIA Firmware Tools (MFT) on VMware ESXi 8.0

Created on Mar 21, 2023

Introduction

This post describes how to install and run NVIDIA® Firmware Tools (MFT) on VMware ESXi 8.0.

References

Overview

The NVIDIA Firmware Tools (MFT) package is a set of firmware management tools used to:

  • Generate a standard or customized NVIDIA firmware image

  • Query firmware information

  • Burn a firmware image

Hardware and Software Requirements

  • A server platform with a ConnectX®-6 Dx adapter card.

  • Installer Privileges: The installation requires administrator privileges on the target machine.

Setup

The setup includes:

Installation and configuration of the VMware ESXi server, vSphere Cluster, and vCenter are out of the scope of this post.

Installation

The first step is to add the Mellanox Firmware Tools depot to the vSphere Cluster image in VMware Lifecycle Manager (vLCM).

To add the Mellanox Firmware Tools (MFT) depot to the image:

  1. Download Mellanox Firmware Tools version 4.22.1 from the MFT web page: Mellanox Firmware Tools (MFT) (nvidia.com).

  2. Open a browser, connect to the vSphere web interface at https://<vcenter_fqdn>, and log in with the administrator@vsphere.local account.

  3. At the Inventory tab, select the cluster, then select the Updates tab. Select Image and check LCM compliance.

  4. On the top left menu, click on the three lines, then select Lifecycle Manager.

  5. Click on Action, then select Import Updates.

  6. At the Import Updates popup, click on Browse.

  7. At the Open popup, select the Mellanox depot, then click Open.

  8. Repeat steps 5 to 7 for the second depot bundle.

  9. At the Inventory tab, select the cluster, then select the Updates tab. Select Image, then click on Edit.

  10. Click on Show details.

  11. Click on ADD COMPONENTS.

  12. Select the Mellanox depots and click SELECT.

  13. Click SAVE.

  14. A compliance check starts automatically.

  15. Click on REMEDIATE ALL to start the MFT installation on the hosts.

  16. Click START REMEDIATION.

  17. To put a host into Maintenance mode, you may need to power off the vCLS VMs on that host manually.

  18. All hosts now have MFT installed.
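
As an optional cross-check from the ESXi shell, the installed packages can be listed with esxcli and filtered for the MFT-related entries. The following is a minimal sketch, not part of the original procedure: the helper name filter_mft_vibs is hypothetical, and it assumes that the installed VIB names contain "mft" and "nmst", which may differ for your depot bundles.

```shell
#!/bin/sh
# Keep only the lines of a VIB listing that mention the MFT or NMST packages.
# Assumption: the installed VIB names contain "mft" and "nmst"; adjust the
# pattern if your depot bundles use different names.
filter_mft_vibs() {
    grep -i -E 'mft|nmst'
}

# Usage on an ESXi host (not executed here):
#   esxcli software vib list | filter_mft_vibs
```

An empty result would suggest that remediation did not install the packages on that host, or that the name pattern needs adjusting.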

Verification

  1. Enable SSH access to the ESXi server.

  2. Log in to the ESXi console with root permissions.

  3. Start the mst driver.

    ESXi console

    [root@clx-host-153:~] /opt/mellanox/bin/mst start
    Module mst is already loaded
    [root@clx-host-153:~]
    
  4. Check the current status of NVIDIA devices.

    ESXi console

    [root@clx-host-153:~] /opt/mellanox/bin/mst status
    PCI devices:
    ------------
    DEVICE_TYPE             MST                           PCI       RDMA            NET                       NUMA
    ConnectX6DX(rev:0)      mt4125_pciconf0              39:00.0
    
    ConnectX6DX(rev:0)      mt4125_pciconf0.1            39:00.1
    
    [root@clx-host-153:~]
    
  5. Query the device information.

    ESXi console

    [root@clx-host-153:~] /opt/mellanox/bin/mlxfwmanager --query
    Querying Mellanox devices firmware ...
    
    Device #1:
    ----------
    
      Device Type:      ConnectX6DX
      Part Number:      MCX623106AC-CDA_Ax
      Description:      ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0 x16; Crypto and Secure Boot
      PSID:             MT_0000000436
      PCI Device Name:  mt4125_pciconf6
      Base GUID:        0c42a103002404ea
      Base MAC:         0c42a12404ea
      Versions:         Current        Available
         FW             22.30.1004     N/A
         PXE            3.6.0301       N/A
         UEFI           14.23.0017     N/A
    
      Status:           No matching image found 
    
    [root@clx-host-153:~]
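
When several hosts need to be checked, the query output above can be reduced to one line per device. The following is a minimal sketch, assuming that awk is available in the ESXi shell and that the field labels match the output shown above ("PSID:", "PCI Device Name:", and the "FW" row); the helper name summarize_fw is hypothetical.

```shell
#!/bin/sh
# Summarize `mlxfwmanager --query` output as:
#   <PCI device name> <PSID> <current FW version>
# Assumption: the field labels and their order match the output shown in
# the Verification steps (PSID and PCI Device Name appear before the FW row).
summarize_fw() {
    awk '
        /^[[:space:]]*PSID:/            { psid = $2 }
        /^[[:space:]]*PCI Device Name:/ { dev = $4 }
        /^[[:space:]]*FW[[:space:]]/    { print dev, psid, $2 }
    '
}

# Usage on an ESXi host (not executed here):
#   /opt/mellanox/bin/mlxfwmanager --query | summarize_fw
```

The function only reads from standard input, so it can also be applied to saved query output collected from multiple hosts.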
    

Appendix A

mst Synopsis

mst [switches]

Commands and Switches Description:

ESXi cli
mst start                            # Create special files that represent Mellanox devices in the /dev directory and load the appropriate modules. After this command completes successfully, the mst driver is ready to work.
mst stop                             # Stop the Mellanox mst driver service and unload the kernel modules.
mst restart                          # Equivalent to "mst stop" followed by "mst start".
mst server start [-p|--port port]    # Start the mst server to allow incoming connections. The default port is 23108.
mst server stop                      # Stop the mst server.
mst status                           # Print the current status of Mellanox devices. Options: -v runs with high verbosity (prints more information on each device).
mst version                          # Print the version information.
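
The device names printed by mst status can be passed to other MFT tools. The following is a minimal sketch, assuming that awk is available in the ESXi shell and that device rows follow the layout shown in the Verification section, with a name of the form mt<id>_pciconf<n> in the second column; the helper name list_mst_devices is hypothetical.

```shell
#!/bin/sh
# Print the MST device names (second column) from `mst status` output.
# Assumption: device rows look like the rows shown in the Verification
# section, with an "mt<id>_pciconf<n>" name in the second column.
list_mst_devices() {
    awk '$2 ~ /^mt[0-9]+_pciconf/ { print $2 }'
}

# Usage on an ESXi host (not executed here):
#   /opt/mellanox/bin/mst status | list_mst_devices |
#   while read -r dev; do
#       /opt/mellanox/bin/mlxfwmanager -d "$dev" --query
#   done
```

Header and separator lines are skipped because their second column does not match the device-name pattern.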


Done!

Authors



Boris Kovalev

Boris Kovalev has worked for the past several years as a Solutions Architect, focusing on NVIDIA Networking/Mellanox technology, and is responsible for complex machine learning, Big Data and advanced VMware-based cloud research and design. Boris previously spent more than 20 years as a senior consultant and solutions architect at multiple companies, most recently at VMware. He has written multiple reference designs covering VMware, machine learning, Kubernetes, and container solutions which are available at the NVIDIA Documents website.



