ConnectX-7 Firmware Release Notes

Bug Fixes History

This section lists the bug fixes from the last three major releases. For the history of older releases, refer to the documentation for the relevant firmware versions.

Internal Ref.

Issue

4578581 / 4626296

Description: Fixed an interoperability issue where, when ConnectX-7 communicates with ConnectX-8 using the probe-based algorithm, bandwidth could become extremely low due to probe packets being dropped.

Keywords: Interoperability, Congestion Control, RTT

Detected in version:

28.47.1026

Fixed in Release:

28.48.1000

4786813

Description: Fixed an issue where the DPA kernel used unsafe ICM access during process creation/modification, which could cause the DPA kernel to hang during FLR.

Keywords: DPA kernel, FLR

Detected in version:

28.47.1026

Fixed in Release: 

28.48.1000

4804664 / 4806969

Description: Fixed an issue in the User Debugger “query caps” where it returned only the number of capabilities, not the capability bitmap.

Keywords: User Debugger “query caps”

Detected in version:

28.47.1026

Fixed in Release:

28.48.1000
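
The difference between the two return values can be illustrated with a minimal sketch: a count says how many capabilities are set, while only the bitmap identifies which ones. The bit positions and capability names below are hypothetical and do not reflect the actual User Debugger capability layout.

```python
from typing import Dict, List

# Hypothetical bit-to-name mapping, for illustration only.
HYPOTHETICAL_CAPS: Dict[int, str] = {0: "cap_a", 1: "cap_b", 3: "cap_d"}


def count_caps(bitmap: int) -> int:
    """Return the number of set bits (what a count-only query would report)."""
    return bin(bitmap).count("1")


def decode_caps(bitmap: int, names: Dict[int, str]) -> List[str]:
    """Return the names of the capabilities whose bit is set in the bitmap."""
    return [name for bit, name in sorted(names.items()) if bitmap & (1 << bit)]
```

Two different bitmaps, such as `0b1001` and `0b0011`, have the same count but decode to different capability sets, which is why the bitmap itself is needed.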

4859700 / NVbug 5828711

Description: Fixed an issue caused by a race condition in standby/boot power sequencing. In certain timing windows, port power-down could be delayed such that the power-up flow detected the port still transitioning to power-down, causing the sequence to fail and leaving the port stuck in a powered-down state.

Keywords: PXE boot

Detected in version:

28.47.1026

Fixed in Release: 

28.48.1000

4484662

Description: Fixed an issue where mlxlink reported 0 values for SNR (media and host) due to incorrect local port mapping in firmware and an incorrect page number used by MFT.

Keywords: mlxlink

Detected in version:

28.47.1026

Fixed in Release:

28.48.1000

4744039

Description: Fixed an issue where, due to an SMBus release race condition, the I2C bus could become stuck.

Keywords: SMBus, I2C bus

Detected in version:

28.47.1026

Fixed in Release:

28.48.1000

4773490 / 4823336

Description: Fixed an issue where fuse values were not aligned with the updated values burned across different ConnectX-7 setups.

Keywords: Fuse values

Detected in version:

28.47.1026

Fixed in Release:

28.48.1000

4532684 / 4635872 / 4794865 / 4794866 / 4794867 / NVbug 5385446

Description: Fixed an issue in the ADP-RETX algorithm by preventing it from re-arming without performing a retransmission.

Keywords: ADP-RETX algorithm

Detected in version:

28.47.1026

Fixed in Release:

28.48.1000

4727303 / 4718947

Description: Fixed an issue in the steering definers used for LAG with IPv6 traffic.

Keywords: LAG, IPv6 traffic, steering

Detected in version:

28.47.1026

Fixed in Release:

28.48.1000

4663915

Description: Fixed an issue where a spurious CNP was sent in response to an out-of-sequence packet.

Keywords: PCC, CNP, OOS, RP, NP

Detected in version:

28.47.1026

Fixed in Release:

28.48.1000

4450570 / 4780432 / 4780433

Description: Fixed an issue where the root complex sent MCTP-over-PCI messages before a BDF was assigned, causing responses to be sent with BDF 0. The fix ensures that MCTP messages routed by ID are ignored until a valid BDF is assigned.

Keywords: MCTP-over-PCI, BDF, MCTP messages

Detected in version:

28.47.1026

Fixed in Release: 

28.48.1000
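
The behavior described in this fix can be sketched as a simple guard. This is an illustrative model only, not the firmware implementation: the class, method names, and return shape are hypothetical, and only the 16-bit bus/device/function packing follows the standard PCI layout.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MctpEndpoint:
    """Hypothetical model of an endpoint that receives MCTP-over-PCI messages."""

    bdf: Optional[int] = None  # None means no BDF has been assigned yet

    def assign_bdf(self, bus: int, device: int, function: int) -> None:
        # Pack bus/device/function into the standard 16-bit PCI layout (8/5/3 bits).
        self.bdf = (bus << 8) | (device << 3) | function

    def handle_route_by_id(self, payload: bytes) -> Optional[Tuple[int, bytes]]:
        """Return (source BDF, response payload), or None if the message is ignored."""
        if self.bdf is None:
            # No valid BDF yet: ignore the message instead of replying with BDF 0.
            return None
        return (self.bdf, payload)
```

In this model, a route-by-ID message that arrives before `assign_bdf` is called produces no response at all, so a reply carrying BDF 0 can never be generated.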

4809134 / 4824635

Description: Fixed an issue where the steering tables were not updated after enabling partial Spectrum-X capabilities (BTH.AR) via LLDP.

Keywords: Steering tables, LLDP

Detected in version:

28.47.1026

Fixed in Release: 

28.48.1000

2169950

Description: Fixed an issue where the FCS indication was not calculated correctly when a packet was decapsulated.

Keywords: FCS

Discovered in Version:

28.42.1000

Fixed in Release: 

28.48.1000

3735988

Description: Fixed an issue where, in InfiniBand systems, the RTT_response_sl feature did not work with sniffer tools (e.g., Wireshark, tcpdump).

Keywords: Health buffer, sniffer, RTT

Discovered in Version: 

28.40.1000

Fixed in Release: 

28.48.1000

Internal Ref.

Issue

4608544

Description: Fixed an issue where, in rare live migration scenarios, a delayed doorbell triggered a false timeout alarm.

Keywords: Live migration, doorbell, timeout alarm

Detected in version:

28.46.1006

Fixed in Release: 

28.47.1088

4648642

Description: Fixed a rare issue in which destroying PCC NP configuration objects could result in assert 0x8175 being logged in dmesg.

Keywords: Assert 0x8175, PCC NP

Detected in version:

28.47.1026

Fixed in Release:

28.47.1088

4718947

Description: Fixed an issue in the steering definers used for LAG with IPv6 traffic.

Keywords: LAG, IPv6 traffic, steering

Detected in version:

28.47.1026

Fixed in Release:

28.47.1088

4690503

Description: Fixed an issue where creating a DPA process that uses 128 MB of data caused the dynamic library to fail with syndrome 0xdc30ac. The BSS section of the DPA application is now limited to 64 MB.

Keywords: DPA process, BSS

Detected in version:

28.47.1026

Fixed in Release:

28.47.1088

4683823

Description: Some diagnostic data counters share hardware resources and cannot be configured simultaneously since 64-bit counter formats (e.g., DIAG_DATA_PARAMS_CONTEXT.output_format set to FORMAT_0 or FORMAT_1) consume more hardware resources per counter.

Keywords: DOCA Telemetry Diagnostics

Detected in version:

28.47.1026

Fixed in Release: 

28.47.1088

Internal Ref.

Issue

4570205

Description: Fixed a firmware issue where the ZTR_RTTCC algorithm parameters AI and HAI did not support a sufficient range.

Keywords: PCC, ZTR_RTTCC

Detected in version:

28.46.1006

Fixed in Release:

28.47.1026

4629077

Description: Fixed an issue where coalescing regular SX events with SX RTT events under ZTR_RTTCC could keep improper event fields, which could impact congestion control behavior.

Keywords: PCC, ZTR_RTTCC

Detected in version:

28.46.1006

Fixed in Release: 

28.47.1026

4683328

Description: Fixed an issue in the ZTR_RTTCC algorithm where probe-abortion handling could behave improperly under high-stress network conditions, ensuring proper congestion control and stable traffic performance.

Keywords: PCC, ZTR_RTTCC

Detected in version:

28.46.1006

Fixed in Release:

28.47.1026

4501554

Description: Fixed an assertion failure that could occur with the E-Switch uplink in specific configurations where the E-Switch was disabled and Path Migration was active or GVMIs were using SRQ loopback in SQs. The issue occurred because the firmware attempted to perform cleanup operations when the uplink configuration lacked sufficient capacity.
Now, when the E-Switch is disabled and no actions are available in the uplink STE, the firmware connects to the uplink STE instead of copying it.

Keywords: Path migration, steering

Detected in version:

28.46.1006

Fixed in Release:

28.47.1026

4506854

Description: Added Scaling Factor "read" field. To obtain correct values in mlxlink, MFT version 4.33.0 or later is required.

Keywords: Scaling Factor, mlxlink, MFT

Detected in version:

28.46.1006

Fixed in Release:

28.47.1026

4540897

Description: Added a recovery mechanism for I²C failures. In case of an I²C communication failure, the system now automatically attempts to recover and reinitialize the I/O expander to maintain continuous operation.

Keywords: I2C failures, recovery mechanism

Discovered in Version:

28.45.1020

Fixed in Release:

28.47.1026

4560691

Description: Fixed an issue in the MCTP SMBus configuration to ensure proper initialization and reliable communication between firmware components using the SMBus transport.

Keywords: MCTP SMBus configuration

Discovered in Version:

28.45.1020

Fixed in Release:

28.47.1026

4529293

Description: Fixed an issue where, during failover or restart, the SM sending a PortInfo MAD to the HCA firmware triggered reinitialization of port buffers, momentarily halting ingress traffic and causing packet drops.
The firmware now avoids reconfiguring port buffers when the new configuration matches the current one.

Keywords: OpenSM

Discovered in Version:

28.45.1020

Fixed in Release:

28.47.1026
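
The "skip when unchanged" behavior can be sketched as follows. This is a minimal illustration under assumed names: the `PortBuffers` class, its configuration keys, and the reinitialization counter are hypothetical and stand in for whatever the firmware tracks internally.

```python
from typing import Dict


class PortBuffers:
    """Hypothetical model of port buffers that are costly to reinitialize."""

    def __init__(self, config: Dict[str, int]) -> None:
        self.config = dict(config)
        self.reinit_count = 0  # counts traffic-disrupting reinitializations

    def apply(self, new_config: Dict[str, int]) -> bool:
        """Apply a configuration; return True only if a reinitialization occurred."""
        if new_config == self.config:
            # Same configuration as the current one: leave the buffers untouched.
            return False
        self.config = dict(new_config)
        self.reinit_count += 1
        return True
```

With this guard, a repeated request that carries the current configuration is a no-op, so ingress traffic is not interrupted.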

4683346

Description: Fixed an issue where, under the ZTR_RTTCC algorithm, a flow that reached its minimum rate due to heavy congestion would not recover its rate once the congestion cleared.

Keywords: PCC, ZTR_RTTCC

Discovered in Version:

28.46.1006

Fixed in Release:

28.47.1026

4213025

Description: Fixed an issue where destroying or modifying a DPA partition from a non-owner VHCA was incorrectly allowed. Such actions are now properly disallowed.

Keywords: VHCA

Discovered in Version:

28.46.1006

Fixed in Release:

28.47.1026

4133425

Description: Fixed an issue where PTP was not supported when the port speed was configured to 1G.

Keywords: PTP

Discovered in Version:

28.46.1006

Fixed in Release:

28.47.1026

Internal Ref.

Issue

4603774

Description: Fixed an issue where the adapter card could drop NC-SI over MCTP commands when padding bytes were present after the NC-SI checksum.

Keywords: NC-SI

Discovered in Version:

28.46.1006

Fixed in Release:

28.46.3048

Internal Ref.

Issue

4501157 / 4257750

Description: Fixed a critical issue with a live firmware patch.

Keywords: Live firmware patch

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4516394

Description: Fixed an issue where an uncleared state caused performance degradation after migration when there were significant differences in resource allocation. The state is now cleared beforehand.

Keywords: Performance

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4286902

Description: Fixed a race condition in DPA process termination during the exception flow, where a failed process could be missed and not reported to the user.

Keywords: DPA

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4420567

Description: Removed an unnecessary and partially incorrect firmware check that blocked valid action list permutations allowed by the PRM. Validation of these permutations remains the responsibility of the software.

Keywords: Header actions

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4443601

Description: Fixed a firmware issue where PXE failed to boot when both LAG ports were up.

Keywords: PXE, LAG

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4475307

Description: Fixed an issue where PCC DCQCN used incorrect parameter values when link speed was 400Gbps or higher.

Keywords: PCC DCQCN, congestion control

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4480427

Description: Fixed incorrect calculation of start address and mode for the CQE buffer in DPA CQ, which could cause CQEs to be written to the wrong address when the buffer is not 4K-aligned and spans a second page boundary.

Keywords: CQ, CQE Buffer, DPA

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006
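
To show why alignment matters here, the sketch below computes which 4K pages a buffer touches, given its start address and size. It is an illustration of the page arithmetic only, not the firmware's address calculation; the function name and example addresses are hypothetical.

```python
from typing import List

PAGE_SIZE = 4096  # 4K pages, as referenced in the description


def pages_spanned(start: int, size: int) -> List[int]:
    """Return the base address of every 4K page touched by [start, start + size)."""
    if size <= 0:
        raise ValueError("size must be positive")
    first = start // PAGE_SIZE
    last = (start + size - 1) // PAGE_SIZE
    return [page * PAGE_SIZE for page in range(first, last + 1)]
```

For example, an 8 KB buffer starting at the 4K-aligned address `0x1000` touches two pages, while the same buffer starting at `0x1F00` crosses two page boundaries and touches three pages (`0x1000`, `0x2000`, `0x3000`).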

4490103

Description: Fixed the restart timing for the OSFP connector at 400 kHz I2C frequency.

Keywords: Restart timing, OSFP, I2C frequency

Discovered in Version:

28.44.1036

Fixed in Release:

28.46.1006

4416919

Description: Updated Diagnostic Counters interface to prevent the following counters from being cleared after read: pcie_link_latency_total_read_packet and pcie_link_latency_total_read_ns.

Keywords: Diagnostic Counters interface

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4520774

Description: Fixed an issue that prevented the adp_retx profile in the ROCE_ACCL access register from being set when there were outstanding QPs on the PF or VF.

Keywords: ROCE_ACCL access register, QPs, PF, VF

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4403143

Description: Fixed an issue where CREATE_DPA_PROCESS could fail if a DESTROY_DPA_PROCESS (still running during destroy) was executed on a different VHCA. Also addressed a possible failure of CREATE_DPA_PROCESS after FLR.

Keywords: DPA_PROCESS, FLR

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4388371

Description: Fixed an issue where an uninitialized pport in the SLRG command, when using the SMP interface, caused an assertion failure.

Keywords: SLRG, SMP interface, pport

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4531558

Description: Fixed incorrect LED behavior where the LED was yellow at maximum speed and green otherwise, contrary to the specification. The cause was a swapped GPIO mapping between the control and PHY LEDs in the INI file.

Keywords: LED

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4470053

Description: Fixed an issue with vQoS parameter configuration to improve latency handling for large messages.

Keywords: vQoS, latency

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4366117

Description: Fixed an issue where configuring a small MTU caused packets critical to the PXE boot process to be fragmented. The PXE boot filters then mistakenly discarded these packets, causing PXE boot to fail.

Keywords: PXE boot filters

Detected in version:

28.45.1020

Fixed in Release:

28.46.1006

4486431

Description: Fixed an issue where issuing multiple parallel queries of DPA_THREAD objects with the same object ID could fail.

Keywords: DPA

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006

4497103

Description: Fixed the setting of the adaptive retransmission profile.

Keywords: Adaptive retransmission profile

Discovered in Version:

28.45.1020

Fixed in Release:

28.46.1006
