NVIDIA MLNX-OS User Manual

Upgrading HA Groups

If fallback is ever necessary in an HA group, all cluster nodes must have the same OS version installed, and they must be reloaded immediately.

To upgrade the NVIDIA Onyx version without affecting an HA group:

  1. Identify the HA group master.

    For MLAG, run:

    SwitchA [my-vip: master] (config)# show mlag
    Admin status: Enabled
    Operational status: Up
    Reload-delay: 1 sec 
    Keepalive-interval: 30 sec
    Upgrade-timeout: 60 min
    System-mac: 00:00:5e:00:01:5d
     
    MLAG Ports Configuration Summary:
    Configured: 1
    Disabled:   0
    Enabled:    1
     
    MLAG Ports Status Summary: 
    Inactive:       0
    Active-partial: 0
    Active-full:    1
     
    MLAG IPLs Summary: 
    ID   Group         Vlan       Operational  Local        Peer         Up Time          Toggle Counter
         Port-Channel  Interface  State        IP address   IP address
    ----------------------------------------------------------------------------------------------------
    1    Po1           1          Up           10.10.10.1   10.10.10.2   0 days 00:00:09  5
    Peers state Summary:
    System-id          State   Hostname
    -----------------------------------
    F4:52:14:2D:9B:88  Up      <SwitchA>
    F4:52:14:2D:9B:08  Up       SwitchB
    

    The MLAG cluster master is the switch with the highest IP address. In this example, the local switch (SwitchA) has IP address 10.10.10.1 and the peer switch has IP address 10.10.10.2, so the peer switch is the MLAG cluster master.

  2. Upgrade each standby node in the HA group according to steps 1-10 in "Upgrading Operating System Software".

  3. Wait until all standby nodes have rejoined the group. 

    Under heavy CPU load or on a noisy network, another node may assume the role of cluster master before all standby nodes have rejoined the group. If this happens, you may stop waiting and proceed directly to step 4.

    When the slave upgrade is complete and the master is still running the lower version, the slave switch does not learn MAC addresses (except for flooded traffic) until the master switch upgrade is complete.

  4. Upgrade the master node in the HA group according to steps 1-10 in "Upgrading Operating System Software".
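
The selection rule in step 1 and the upgrade order in steps 2-4 can be illustrated with a minimal Python sketch. It is not part of the switch software and does not talk to a switch. The function names mlag_master and upgrade_order are hypothetical, and the dictionary is filled in by hand with the IPL addresses from the example output, assuming the peer at 10.10.10.2 is SwitchB.

```python
import ipaddress


def mlag_master(ipl_addresses):
    """Return the label whose IPL IP address is numerically highest.

    Assumption: ipl_addresses maps a switch label (for example "SwitchA")
    to the IPL IP address string read from the "show mlag" output.
    """
    # Compare as IP addresses, not as strings: "10.10.10.9" sorts above
    # "10.10.10.10" as text but below it numerically.
    return max(ipl_addresses, key=lambda name: ipaddress.ip_address(ipl_addresses[name]))


def upgrade_order(nodes, master):
    """Return nodes in upgrade order: every standby first, the master last."""
    if master not in nodes:
        raise ValueError(f"{master!r} is not a member of the HA group")
    standbys = [node for node in nodes if node != master]
    return standbys + [master]


# Values copied by hand from the example output: local SwitchA, peer SwitchB.
ipl = {"SwitchA": "10.10.10.1", "SwitchB": "10.10.10.2"}
master = mlag_master(ipl)
print(master)                            # SwitchB
print(upgrade_order(list(ipl), master))  # ['SwitchA', 'SwitchB']
```

The addresses are compared numerically because a plain string comparison would rank 10.10.10.9 above 10.10.10.10 and could pick the wrong master.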
