ConnectX-7 Firmware Release Notes

Bug Fixes History


This section includes the bug fix history of the last 3 major releases. For the history of older releases, please refer to the relevant firmware versions.


Internal Ref.

Issue

3959470

Description: Fixed a misconfiguration in OVS that affected the Congestion Control algorithm when RTT packets were sent on a different priority. The Round Trip Time (RTT) Congestion Control internal packets did not reach the SW, even when the flow was software offloaded (and OVS had not yet moved the packets to hardware offload).

To solve the issue, such packets are now sent to the SW when they are SW offloaded.

Keywords: Round Trip Time (RTT) Congestion Control

Discovered in Version:

28.39.3004

Fixed in Release:

28.39.3004

3887760

Description: Fixed an issue that caused a Completion Timeout to mistakenly be treated as an Advisory Non-Fatal error. A Completion Timeout is now treated as an uncorrectable error.

Keywords: Completion Timeout, Advisory Non-Fatal error

Discovered in Version:

28.39.3004

Fixed in Release:

28.39.3004

3887774

Description: Fixed an issue that prevented the PLDM command Get Schema URI from functioning properly when there were no base RDE resource IDs.

Keywords: PLDM

Discovered in Version:

28.39.3004

Fixed in Release:

28.39.3560

3910366

Description: Fixed an issue that prevented RDE Port resource from showing 400Gb speed in CapableLinkSpeedGbps and in MaxSpeedGbps in some InfiniBand cards. 

Keywords: 400Gb, InfiniBand, RDE

Discovered in Version:

28.39.3004

Fixed in Release:

28.39.3560

3910366

Description: Fixed an issue where the CR_SPACE was open to any read operation, even though some reads could lock the gateway. Bad reads from CR_SPACE will now result in a bad_access error being returned.

Keywords: CR_SPACE, Gateway

Discovered in Version:

28.39.3004

Fixed in Release:

28.39.3560

3910368

Description: Blocked access to invalid CR-SPACE registers when the adapter cards are secured.

Keywords: CR-SPACE registers

Discovered in Version:

28.39.3004

Fixed in Release:

28.39.3560

3942112

Description: Fixed an issue that resulted in a device assert when using DCBX CEE.

Keywords: DCBX

Discovered in Version:

28.39.3004

Fixed in Release:

28.39.3560

3818997

Description: Improved ZTR_RTTCC algorithm fairness when running with 4K MTU.

Keywords: PCC

Discovered in Version:

28.39.3004

Fixed in Release:

28.39.3560

3925691

Description: Fixed an issue that caused CNP or RTT counters not to wrap around properly.

Keywords: CNP, RTT, counters

Discovered in Version:

28.39.3004

Fixed in Release:

28.39.3560
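
For readers who sample the CNP or RTT counters, the following is a minimal sketch of how the delta between two samples can be computed so that a single wrap-around is handled. It does not describe the firmware's implementation. The function name `counter_delta` and the 32-bit default width are assumptions made for illustration; the actual counter width is not stated in these notes.

```python
def counter_delta(previous: int, current: int, width_bits: int = 32) -> int:
    """Return how far a free-running counter advanced between two samples.

    width_bits is a hypothetical counter width chosen for this sketch.
    """
    modulus = 1 << width_bits
    # Modular subtraction stays correct when the counter wrapped once
    # between the two samples.
    return (current - previous) % modulus
```

With a 32-bit width, a sample pair of `0xFFFFFFF0` followed by `0x10` yields a delta of 32 rather than a negative number.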

3929376

Description: Fixed an issue where Congestion Control could malfunction due to an invalid database.

Keywords: Congestion control

Discovered in Version:

28.39.3004

Fixed in Release:

28.39.3560

3832284

Description: Fixed an issue in which the mlxconfig setting for CNP moderation prevented the Congestion Control (CC) mechanism from working properly.

Keywords: Congestion control, CNP

Discovered in Version:

28.39.3004

Fixed in Release:

28.39.3560



Internal Ref.

Issue

3730282

Description: Added mlxconfig ROCE_CC_DCQCN_COMPATIBILITY_MODE for interoperability with different generations of HCAs, and ROCE_CC_CNP_MODERATION for different CNP moderation options.

Keywords: Congestion Control, DCQCN, CNP

Discovered in Version:

28.38.1002

Fixed in Release:

28.39.3004

3757772

Description: Changed the link speed setting behavior to "full link speed" instead of the limited rate when the device is in InfiniBand mode and Congestion Control does not have a valid database to use for the data.

Keywords: IB Congestion Control

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.39.3004

3748944

Description: Fixed an issue that kept the adapter cards' quad ports UP when using breakout cables / QSFP-split-4. Now when a 4 alignment loss is noticed, the link in 25G/lane Ethernet is dropped.

Keywords: Quad ports, link up, breakout cables / QSFP-split-4

Discovered in Version: 

28.39.1002

Fixed in Release: 

28.39.3004

3748943

Description: Modified PCIe switch Downstream Port EQLZ.PH1 timing to 3ms.

Keywords: PCIe, EQLZ, Phase1

Discovered in Version: 

28.39.1002

Fixed in Release: 

28.39.3004

3699086

Description: Fixed a rare race condition in NODNIC teardown that caused commands to hang on regular PF. 

Keywords: NODNIC teardown

Discovered in Version: 

28.39.1002

Fixed in Release: 

28.39.3004

3770362

Description: Fixed an issue that prevented Congestion Control from behaving properly when GRH is used in traffic of an IB cluster.

Keywords: IB congestion control, CNP, SL

Discovered in Version: 

28.39.1002

Fixed in Release: 

28.39.3004

3748947

Description: Added back the Digital Feedforward Equalizer (DFFE) hardware component to improve the signal integrity of the link.

Keywords: Digital Feedforward Equalizer (DFFE)

Discovered in Version: 

28.36.2020

Fixed in Release: 

28.39.3004



Internal Ref.

Issue

3652874

Description: Fixed firmware measurements calculation.

Keywords: Firmware measurements calculation

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3664415




Description: Fixed an issue that caused Live Migration to hang during the "save" stage.

Keywords: Live migration

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3629353




Description: Fixed the cr_space in the port configuration to prevent wrong timestamps in CQEs.

Keywords: Hardware timestamp

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3582559




Description: Added support for LED scheme #2 to MCX750500B-0D0K / MCX750500B-0D00 adapter cards. 

Keywords: LED

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3669258




Description: Fixed a rare issue that prevented changes in mlxconfig from taking effect upon warm reboot.

Keywords: mlxconfig

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3670719 / 3676590




Description: Added a small delay after the power up process to fix an issue that occasionally caused the module to be unstable after the power up.

Keywords: Link up

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3629562




Description: Fixed a code mismatch in the handling of the link-down cause when remote faults were received.

Keywords: Link down

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3532508




Description: Fixed a wrong parameter in the cable info MAD that resulted in unnecessary messages in the log. 

Keywords: Cable info MAD

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3634350



Description: Disabled PCI power event messages on OCP 3.0 adapter cards according to the spec requirements.

Keywords: PCI, OCP 3.0

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3636714




Description: Fixed an issue in which the buffer for PLDM firmware updates that had pending NIC requests was not properly locked in the case of PLDM-over-NC-SI, and was consequently corrupted by other flows.

Keywords: PLDM, buffer

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3592276




Description: Fixed an issue that prevented MSI interrupts from being advertised correctly, resulting in the wrong MSI being sent.

Keywords: MSI

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3605363



Description: "Get Temperature" OEM command now always returns a unified temperature.

Keywords: Temperature

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3531972



Description: Changed the BAR configuration algorithm so that, when the host configures the same BAR address for two different PFs, the last update to the BAR address is the one that takes effect.

Keywords: Network Interface

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3626872



Description: Fixed an issue that caused the firmware to miscalculate the value of the maximum current temperature measured from all the diodes (found in the Internal_sensor_curr_temp field). 

Keywords: Sensor, temperature

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3544340 / 3537706 / 3639178



Description: Improved SPDM v1.0 compatibility. SPDM measurements signature additional fixes.

Keywords: SPDM

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3587821




Description: Fixed a HW bug that resulted in transaction loss when a cache replacement transaction occurred in parallel with code transcoding.

Keywords: HW bug, transaction loss

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3610861




Description: The EEPROM module got stuck in polling after 20% of resets. To resolve the issue, a delay was added after configuring the module to high power.

Keywords: Polling, module, reset

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3507928



Description: Fixed a linkup failure issue that occurred when connecting to a 25GbE transceiver by clearing the PSI Aging before trying to open Tx power.

Keywords: Cables, PSI Aging, 25GbE transceiver

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3602379




Description: The "Bad Signal Integrity" message seen after power cycle can be safely ignored. The user should monitor BER number.

Keywords: Bad Signal Integrity, BER

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3605686



Description: Fixed a statics issue that caused the i2c access to the module to lock up and the switch to get stuck.

Keywords: i2c, switch

Discovered in Version: 

28.38.1900

Fixed in Release: 

28.39.2048

3482251



Description: Added support for a hairpin drop counter in the QUERY_VNIC_ENV command.

Keywords: Hairpin

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3539437



Description: Fixed an issue that prevented the get_func_num_from_pci_func_num function from returning the value "-1" for an undefined function type.

Keywords: get_func_num_from_pci_func_num

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3570478



Description: Fixed Signal-to-Noise Ratio (SNR) value calculation for correct readings from the MMA4Z00 optical cable module.

Keywords: SNR

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048

3602169



Description: Added a locking mechanism to protect the firmware from a race condition between the insertion and deletion of the same rule in parallel. This race occasionally resulted in the firmware accessing memory that had already been released, causing an IOMMU / translation error.

Note: This fix will not impact insertion rate for tables owned by SW steering.

Keywords: Firmware steering

Discovered in Version: 

28.38.1002

Fixed in Release: 

28.39.2048
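
As a generic illustration of the kind of protection described for issue 3602169, the sketch below serializes insertion and deletion of the same rule with a single lock, so that neither operation can observe a half-updated entry. It is not the firmware's mechanism. The class name `RuleTable`, its methods, and the use of one coarse lock are all assumptions made for this example.

```python
import threading


class RuleTable:
    """Hypothetical steering-rule table guarded by one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._rules = {}

    def insert(self, key: str, data: bytes) -> None:
        # Only one thread at a time may modify the table.
        with self._lock:
            self._rules[key] = data

    def delete(self, key: str) -> bool:
        # Returns True only if the rule existed and was removed.
        with self._lock:
            return self._rules.pop(key, None) is not None

    def lookup(self, key: str):
        # Readers take the same lock, so they never see a rule that is
        # in the middle of being removed.
        with self._lock:
            return self._rules.get(key)
```

A single coarse lock is the simplest choice for a sketch; a real implementation would likely use finer-grained locking to limit the cost on the insertion path.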

3588515 / 3409806




Description: Fixed a race condition that led to a firmware assert upon driver removal, or when changing the ETH flow control scheme under a stress of ingress packets larger than the MTU.

Keywords: Race condition, firmware assert

Discovered in Version:

28.38.1002

Fixed in Release:

28.39.2048

3610169



Description: Fixed QoS Shaper handling behavior for non-transmitting applications.

Keywords: QoS Shaper

Discovered in Version:

28.38.1002

Fixed in Release:

28.39.2048



Internal Ref.

Issue

3537571

Description: Fixed SPDM measurements signature.

Keywords: SPDM

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3439757



Description: Fixed an issue that prevented the system from detecting the PCIe device during slot DC power cycle tests.

Keywords: PCIe device, DC power cycle tests

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3534473



Description: Added a new field/slot ID to PRS  pcie_cfg_data.pci_cfg_space.pciex.pcie_switch_ini_defined_base_slot_id = 3 to define a specific slot number for GPU bridge DSP.

Keywords: Slot ID

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3331179



Description: Improved token calculation.

Keywords: Token calculation

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3299420



Description: Upgrading from firmware v28.38.1014 and below to v28.38.1002 no longer requires an upgrade to an intermediate version.

Keywords: Firmware upgrade

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3394841



Description: Updated the plug in/out events' reporting method to report only when the last recorded event is the opposite of the current event.

Keywords: Port events

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002
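
The reporting rule described for issue 3394841 can be sketched as a small filter that suppresses repeated events. This is an illustration only, not the firmware's code. The names `PlugEventFilter`, `PLUG_IN`, and `PLUG_OUT` are hypothetical, and reporting the very first event (when nothing has been recorded yet) is an assumption that the notes do not confirm.

```python
from typing import Optional

PLUG_IN = "plug_in"    # hypothetical event names
PLUG_OUT = "plug_out"


class PlugEventFilter:
    """Report an event only if it differs from the last recorded one."""

    def __init__(self) -> None:
        self._last_recorded: Optional[str] = None

    def should_report(self, event: str) -> bool:
        if event not in (PLUG_IN, PLUG_OUT):
            raise ValueError(f"unknown event: {event!r}")
        if event == self._last_recorded:
            # Same as the last recorded event: suppress the duplicate.
            return False
        # Assumption: the first event ever seen is also reported.
        self._last_recorded = event
        return True
```

Feeding the sequence plug in, plug in, plug out, plug out, plug in through the filter reports the first, third, and fifth events only.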

3469311



Description: Fixed the SPDM operations order according to the spec. v1.1.0.

Keywords: SPDM operations

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3527987



Description: Added support for NC-SI channel on both ports.

Keywords: NC-SI channel

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3459317



Description: Changed the protection mechanism for BAR configuration.

Keywords: BAR configuration

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3345150



Description: Fixed an issue that caused a packet with invalid/bad padcount to be silently dropped instead of sending a bad nack error.

Keywords: Packet drop

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3418627



Description: Fixed wrong credits configuration that occurred when MAX_ACC_OUT_READ was configured.

Keywords: Performance

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3466088



Description: Updated the SX root to work with driverless mode in vport0 gvmi teardown.

Keywords: Driverless mode

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3487313



Description: Fixed a rare deadlock case between 2 DC packets on the RX side.

Keywords: Firmware deadlock

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3495889



Description: Fixed a QoS host port rate limit shaper inaccuracy that occurred when the shaper was configured via the QSHR access register.

Keywords: Port rate limit shaper

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

3449451



Description: When using ConnectX-7 adapter card as InfiniBand, the port must be configured to use the Auto-Negotiation mode.

Keywords: Auto-Negotiation, InfiniBand

Discovered in Version: 

28.37.1014

Fixed in Release: 

28.38.1002

Internal Ref.

Issue

3272599






Description: Removed the option to clear "Tx disable cap" for all non-baseT SFP modules.

Keywords: Tx disable cap

Discovered in Version:

28.36.1010

Fixed in Release: 

28.37.1014

3339087




Description: Added a split mask verification process to check whether or not a module is split in HCA. 

Keywords: Cables, split module

Discovered in Version:

28.36.1010

Fixed in Release: 

28.37.1014

3411270




Description: Fixed an issue that resulted in a firmware crash when setting large payload length values (more than ~1500) in the NC-SI command's header.

Keywords: NC-SI

Discovered in Version:

28.36.1010

Fixed in Release: 

28.37.1014
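
A common defensive pattern for this class of problem is to validate a length field taken from a header before using it to read the payload. The sketch below shows that pattern in general terms and is not the firmware's fix. The function name, the parameters, and the 1500-byte limit are assumptions chosen for illustration; the notes only say the crash occurred for values of more than ~1500.

```python
MAX_PAYLOAD_LEN = 1500  # hypothetical limit used for this sketch


def payload_length_is_valid(declared_len: int,
                            bytes_received: int,
                            max_len: int = MAX_PAYLOAD_LEN) -> bool:
    """Check a header-declared payload length before trusting it."""
    # Reject lengths that are negative or exceed the supported maximum.
    if declared_len < 0 or declared_len > max_len:
        return False
    # Reject lengths that claim more data than was actually received.
    return declared_len <= bytes_received
```

A command that fails this check would be rejected with an error response instead of being processed.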

3405790




Description: Fixed an issue that resulted in the interface type being shown as "unsupported" in CMIS modules.

Keywords: CMIS

Discovered in Version:

28.36.1010

Fixed in Release: 

28.37.1014

3418889




Description: Updated the NEGOTIATE_ALGORITHMS response according to the SPDM specification.

Keywords: SPDM

Discovered in Version:

28.36.1010

Fixed in Release: 

28.37.1014

3409686




Description: Added the option to clear the DPC registers after warm reboot.

Keywords: DPC

Discovered in Version:

28.36.1010

Fixed in Release: 

28.37.1014

3411116




Description: Fixed the configuration of the TS1s sent by the DownStream port (DSP) when moving to EQLZ.ph2.

Keywords: DSP

Discovered in Version:

28.36.1010

Fixed in Release: 

28.37.1014

3138665




Description: Changed the initial Tx preset configuration for the DownStream port (DSP).

Keywords: Tx, DSP

Discovered in Version:

28.36.1010

Fixed in Release: 

28.37.1014

3138665



Description: The PLDM firmware update process fails if a chunk size of 1304 bytes is chosen.

Keywords: PLDM firmware update

Discovered in Version:

28.34.4000

Fixed in Release: 

28.37.1014

3336619



Description: Fixed an issue that occurred during secure firmware update when decrypting and authenticating each chunk of data using its authentication tag. The issue appeared when the main code chunk was split across user chunks and another GCM operation (e.g., flash read with decryption) ran in between. That GCM operation broke the GCM context for main chunk authentication, which therefore failed.

Keywords: Secure firmware update, GCM, code chunk

Discovered in Version: 

28.36.1010

Fixed in Release: 

28.37.1014

3327847



Description: The CNP received, handled, and ignored hardware counters do not work after moving to Programmable Congestion Control mode.

Keywords: CNP, Programmable Congestion Control

Discovered in Version: 

28.36.1010

Fixed in Release: 

28.37.1014

3336610




Description: Fixed a rare issue that prevented the hardware from handling an error flow that occurred when accessing the DPA cluster L2 cache from the firmware processor. In this case the firmware processor hardware requested a VA=>PA translation from the internal mmio, and the address translation was broken by the mmio on the 4K page boundary.

Keywords: Error handling, mmio, firmware processor

Discovered in Version: 

28.36.1010

Fixed in Release: 

28.37.1014

3073517



Description: When connecting a ConnectX-7 adapter card to a ConnectX-5 or an NVIDIA Spectrum switch, raising a 10G/40G link over a 100G optical cable is not supported.

Keywords: Optical cables, ConnectX-5, NVIDIA Spectrum

Discovered in Version: 

28.33.4030

Fixed in Release: 

28.37.1014

3358994



Description: Fixed an issue that prevented the hardware from consuming Port-VL and credits, which consequently blocked traffic from being transmitted due to a race condition between the firmware and the hardware when accessing the chip memory (CR space).

Keywords: Firmware race, CR space, Port-VL

Discovered in Version: 

28.36.1010

Fixed in Release: 

28.37.1014

Internal Ref.

Issue

-

Description: Fixed an issue on adapter cards with P/N MCX755106AS-HEAT that prevented the link from coming up after changing both ports to Ethernet mode.

Keywords: Port type, link up

Discovered in Version: 

28.36.1010

Fixed in Release: 

28.36.1700

Internal Ref.

Issue

3923754 (NVbugs)



Description: Fixed an issue that caused the Downstream Port Containment (DPC) not to be exposed on the downstream ports of the top level PCIe switch in products supporting PCIe switch.

Keywords: PCIe Switch, DPC

Discovered in Version: 

28.35.1012

Fixed in Release: 

28.36.1010

3070480



Description: Fixed an issue that resulted in PRBS lock loss (PRBS_CHK_ERR_CNT_NO_CLR field is raising) when the PRBS mode was first configured on the ConnectX-7 adapter card and then on the Wedge400 switch.

Keywords: PRBS

Discovered in Version: 

28.35.1012

Fixed in Release: 

28.36.1010

3317621



Description: Fixed an issue that caused wqe_based_steering CQEs not to be generated upon an error.

Keywords: CQE

Discovered in Version: 

28.35.1012

Fixed in Release: 

28.36.1010

3239340



Description: Aligned RDE behavior to DSP0266 v1.15.0  table 23.

Keywords: RDE

Discovered in Version: 

28.35.1012

Fixed in Release: 

28.36.1010

3016801



Description: Fixed a rare issue that prevented the link from coming up when connecting a ConnectX-7 adapter card to IXIA at PAM4 speeds.

Keywords: PAM4, IXIA, link up

Discovered in Version: 

28.35.1012

Fixed in Release: 

28.36.1010

3073517



Description: Fixed an issue that resulted in device link down, and the device not being able to get traffic, when moving between two states DETECT and POLLING CONFIG in RTL.

Keywords: RTL, link down, traffic

Discovered in Version: 

28.35.1012

Fixed in Release: 

28.36.1010

3073517



Description: When connecting a ConnectX-7 adapter card to a ConnectX-5 or an NVIDIA Spectrum switch, first configuring 10G/40G and then configuring back to 100G results in a linkup failure.

Keywords: ConnectX-5, NVIDIA Spectrum, linkup

Discovered in Version: 

28.33.4030

Fixed in Release: 

28.36.1010

3077026





Description: When connecting with MMS4X00-NL400 transceiver at 200Gb/s, instability may be experienced upon link up. 

Keywords: Transceiver, Link Up

Discovered in Version:

28.34.1002

Fixed in Release: 

28.36.1010

3077026

Description: When connecting a ConnectX-7 adapter card to another ConnectX-7 adapter card with one side configured to RM Loopback, a link flap may be experienced when the port is toggled.

Keywords: Link flap

Discovered in Version:

28.34.1002

Fixed in Release: 

28.36.1010

3106146




Description: Live migration of MPV affiliated function pair is not supported when port numbers are changed. Each function should stay on the same port number as before migration.

Keywords: MPV live migration

Discovered in Version:

28.34.1002

Fixed in Release: 

28.36.1010

2169950




Description: When decapsulation on a packet occurs, the FCS indication is not calculated correctly.

Keywords: FCS

Discovered in Version:

28.34.1002

Fixed in Release: 

28.36.1010

3147219



Description: SPDM Get Measurements might return an invalid signature while executed without the included measurements (request param2 = 0).

Keywords: SPDM

Discovered in Version:

28.34.4000

Fixed in Release: 

28.36.1010

3147207



Description: The SPDM challenge command returns the hash of all the measurements without their headers.

Keywords: SPDM

Discovered in Version:

28.34.4000

Fixed in Release: 

28.36.1010

3261861




Description: Connecting an HDR device to an NDR device with Optical cables longer than 30m causes degradation in the bandwidth.

Keywords: HDR-to-NDR, cables

Discovered in Version:

28.35.1012

Fixed in Release: 

28.36.1010

3225504




Description: Enabled constant clock offset (visible using PPS out) when synchronizing the device using PTP in 25G or 10G port link speed.

Keywords: PTP, PPS offset

Discovered in Version: 

28.35.1012

Fixed in Release: 

28.36.1010

3288489




Description: Fixed an issue that caused the Pkey table not to be updated, and wrong value to be sent, when the MADs handled in a long process were sent using GLOBAL_GVMI instead of vport0_gvmi.

Keywords: Pkey

Discovered in Version: 

28.35.1012

Fixed in Release: 

28.36.1010

3283455



Description: Fixed a wrong lane mapping to serdes when selecting the OSFP port and using only 4 lanes.

Keywords: QSFP, lanes

Discovered in Version: 

28.35.1012

Fixed in Release: 

28.36.1010


