DOCA Framework Bug Fixes
|
Ref # |
Issue |
|---|---|
|
4469496 |
Description: In some environments, when an application deletes all flow rules from a template table and then attempts to re-add flow rules to the same table, an error with |
|
Keyword: Flow rules |
|
|
Detected in version: 2.9.2 |
|
|
4391384 |
Description: The UCX package is built without GDR copy support due to an unintended change in the build system that excluded CUDA. As a result, applications relying on GDR functionality in UCX are unable to use it. |
|
Keyword: GDR support; CUDA missing; build environment |
|
|
Detected in version: 2.9.2 |
|
|
4403063 |
Description: When the package |
|
Keyword: MFT; firmware |
|
|
Detected in version: 2.9.2 |
|
|
4259675 |
Description: In rare cases, systems using shared receive queues (shared_rxq) may experience incorrect packet handling during high-throughput traffic. |
|
Keyword: Shared RXQ; packet corruption; routing error |
|
|
Detected in version: 2.9.2 |
|
|
4410028 |
Description: On SLES 15 SP5 with kernel version 5.14.21-150500.55.68-default or later, installation of mlnx-ofa_kernel drivers fails to use weak-modules, causing the system to fall back to inbox OFED modules. This occurs because the kernel used to build the drivers (5.14.21-150500.53-default) did not include the mana_ib driver, while newer kernels do—triggering a weak-modules sanity check failure due to the missing replacement. |
|
Keyword: Weak modules; Kernel version mismatch; inbox driver conflict |
|
|
Detected in version: 2.9.0 |
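
To check whether a SLES 15 SP5 host falls in the kernel range described in issue 4410028, the running kernel release can be compared against the first affected release. The following is a minimal sketch, not part of any NVIDIA tooling: the helper names `parse_kernel_release` and `is_affected` are hypothetical, and it assumes release strings of the form quoted in the description (numeric fields followed by a non-numeric flavor suffix such as `-default`).

```python
import re

# First affected kernel release, taken from the description of issue 4410028.
FIRST_AFFECTED = "5.14.21-150500.55.68-default"


def parse_kernel_release(release: str) -> tuple[int, ...]:
    """Return the numeric fields of a kernel release string as a tuple.

    Hypothetical helper: it assumes the flavor suffix (e.g. "-default")
    contains no digits, so only version fields are collected.
    """
    return tuple(int(field) for field in re.findall(r"\d+", release))


def is_affected(running: str, first_affected: str = FIRST_AFFECTED) -> bool:
    """True if the running kernel is at or after the first affected release."""
    return parse_kernel_release(running) >= parse_kernel_release(first_affected)
```

On a live system the input could come from `platform.release()` in the Python standard library. The comparison is only meaningful on SLES 15 SP5; other distributions use different release numbering.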
BSP Bug Fixes
|
Ref # |
Issue Description |
|---|---|
|
4403055 |
Description: Repeated power cycles cause corruption in the EXT4 file system. |
|
Keywords: Power cycle; FS corruption |
|
|
Reported in version: |
BMC Bug Fixes
|
Ref # |
Issue Details |
|---|---|
|
4944048 |
Description: When upgrading or downgrading between the 25.10-LTSU2 and 26.04 releases, repeated BMC reboots may, in rare cases, cause the |
|
Workaround: Perform a factory reset on the BMC. |
|
|
Keyword: BMC reboot; core dump; factory reset |
|
|
Reported in version: 25.10-LTSU2 |
|
|
4917779 |
Description: Initiating an Arm |
|
Reported in version: 26.01 |
|
|
4948318 / 4945554 |
Description: If a secondary BMC task (such as a log dump) is started after the BMC firmware update has been initiated, but before the installer's monitoring logic has attached to it, the installer may mistakenly track the secondary task. This tracking error causes the installer to misjudge the update's completion, which can cause the subsequent BMC reboot to fail and leave the new firmware in a pending, unactivated state. |
|
Reported in version: 26.01 |
|
|
4401488 |
Description: The BMC kernel enforces |
|
Reported in version: 26.01 |
|
|
4905017 |
Description: When operating in NIC mode, a host power cycle may intermittently cause the UEFI to fail to retrieve BMC Redfish credentials. This results in a |
|
Reported in version: 26.01 |
|
|
4969243 |
Description: When the |
|
Reported in version: 26.01 |
|
|
4995032 |
Description: Redfish queries via |
|
Reported in version: 26.01 |
|
|
4867786 |
Description: During BFB installation, the Golden ARM image update may intermittently hang and fail via Redfish, logging a |
|
Reported in version: 26.01 |
|
|
4914053 |
Description: The BFB installer defaults to DHCP for the VLAN4040 interface. If no DHCP server is present, the request silently fails after a 300-second timeout, bypassing the static IP fallback and skipping all BMC-related firmware updates. |
|
Reported in version: 26.01 |
|
|
4924426 |
Description: Following a DPU reset, the |
|
Reported in version: 26.01 |
|
|
4987307 |
Description: During BFB installations via Redfish, the task state may change to "Exception" before the specific error message is appended to the HTTP response payload. This results in incomplete error logs on the initial poll following a failure. |
|
Reported in version: 26.01 |
|
|
4980118 |
Description: The |
|
Reported in version: 26.01 |
|
|
4799519 |
Description: Accessing the |
|
Reported in version: 26.01 |
|
|
4932328 |
Description: Excessive Common Platform Error Record (CPER) files in |
|
Reported in version: 26.01 |
|
|
4957197 |
Description: When external monitoring tools or scripts repeatedly query the BMC's Redfish interface using Basic authentication over extended periods, internal session resources fail to release properly. This memory leak eventually causes the BMC to lose network connectivity, even while the DPU management interface remains online. |
|
Reported in version: 26.01 |
|
|
4966472 |
Description: The BMC generates a warning log for PLDM_Sensor_1_100 when the NIC temperature reaches the official 91°C upper non-critical threshold. This is an expected hardware alert for elevated temperatures, not a software defect. |
|
Reported in version: 26.01 |
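
For clients that talk to a BMC version still affected by issue 4987307, one way to avoid capturing an incomplete error log is to poll the task again when it reports "Exception" with no messages attached. The sketch below is an illustration under stated assumptions, not the installer's logic: `wait_for_task_messages` and the caller-supplied `fetch_task` function are hypothetical, and it assumes the response body follows the standard Redfish Task resource with `TaskState` and `Messages` properties.

```python
import time
from typing import Callable


def wait_for_task_messages(
    fetch_task: Callable[[], dict],
    retries: int = 3,
    delay_s: float = 1.0,
) -> dict:
    """Re-poll a failed Redfish task until its Messages array is populated.

    fetch_task is a caller-supplied (hypothetical) function that performs
    the HTTP GET on the task URI and returns the decoded JSON body.
    """
    task = fetch_task()
    for _ in range(retries):
        # Stop as soon as the task is not in "Exception" or already has messages.
        if task.get("TaskState") != "Exception" or task.get("Messages"):
            break
        time.sleep(delay_s)
        task = fetch_task()
    return task
```

Passing the fetch step in as a function keeps the retry logic independent of the HTTP client and the authentication method in use.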
BlueField-3 Firmware Bug Fixes
|
Internal Ref. |
Issue |
|---|---|
|
4422979 |
Description: Fixed a rare case causing PCIe failure after a power cycle. |
|
Keywords: PCIe |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4388371 |
Description: Fixed an issue where an uninitialized pport in the SLRG command, when using the SMP interface, caused an assertion failure. |
|
Keywords: SLRG, SMP interface, pport |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4486422 |
Description: Fixed an issue where PCIe errors from the endpoint were incorrectly reported to RAS even when they were not reported to the host, ensuring compliance with PCIe spec 6.2.5 (Sequence of Device Error Signaling and Logging Operations). |
|
Keywords: PCIe |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4364539 |
Description: Fixed a race condition in which issuing a reset command to the NIC while the flash is in suspend mode caused the NIC to reboot without recognizing that the flash was still suspended. |
|
Keywords: PRS |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4444987 |
Description: Removed the incorrect INI configuration that skipped receiver detection from the relevant PRS. |
|
Keywords: PRS |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4427796 |
Description: Enabled MCTP communication with the DPU BMC on SKUs: 900-9D3C6-00SV-DA0 and 900-9D3C6-B9SV-DA0. |
|
Keywords: MCTP communication, DPU BMC |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4438736 |
Description: Fixed a race condition between firmware and hardware flows during QP closure and a potential endless loop. |
|
Keywords: Race; endless loop |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4470567 |
Description: Modified the VQoS parameter configuration to improve latency for large messages. |
|
Keywords: VQoS, latency improvement |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4443919 / 4395036 |
Description: Fixed a race condition between firmware and hardware flows during QP closure. |
|
Keywords: Race condition |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4355566 |
Description: Fixed high latency observed in IB_READ_LATENCY when eswitch scheduling is enabled and a rate limit is set. |
|
Keywords: Data latency |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4444874 |
Description: Fixed an issue where the firmware failed to de-assert the PERST signal of the DSP on pcore1. The fix involved correctly checking the output of the default GPIO mapping against 0xFFF (NO_GPIO_FUNCTION) instead of 0xFF (INVALID_READ). |
|
Keywords: PERST signal |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4234972 |
Description: Fixed an issue where the isr_distributer, responsible for distributing tokens to SQs, was not being triggered reliably every 100 µs. Its priority has been elevated to HIGH, and it is now marked as 'busy' upon completion to ensure consistent and timely execution. |
|
Keywords: VQoS |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4384412 |
Description: Fixed an issue where the firmware could send an incorrect object_id in the device emulation object change event, causing the virtio-net controller to fail in handling operations on the host's virtio device. This typically occurred after a software live upgrade when many events were triggered simultaneously—such as unbinding drivers on VFs in parallel—and could result in a host hang. |
|
Keywords: Device emulation object change event |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4344710 |
Description: Removed the MSB of pkg_id, which was enabled by default, from the strap. pkg_id now correctly supports values in the range 0 to 3. |
|
Keywords: NC-SI package ID |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4330201 |
Description: Fixed an issue that prevented the OS from booting due to UEFI PCI enumeration. |
|
Keywords: Booting |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4283167 |
Description: Fixed an issue in the VQoS algorithm related to learning when an element is active and when it begins sending traffic. |
|
Keywords: VQoS algorithm |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4283168 |
Description: Resolved a high-latency issue that occurred when enabling the VF group rate limiter (ESW scheduling). |
|
Keywords: Rate limiter |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4361277 |
Description: Fixed an issue in the ZTR_RTTCC algorithm when using SOURCE_QP (ROCE_CC_SHAPER_COALESCE in mlxconfig) in LAG mode, which caused low bandwidth in many-to-one traffic scenarios. |
|
Keywords: LAG, PCC, ZTR_RTTCC |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4403151 |
Description: Fixed an issue that caused reduced bandwidth during the initial traffic phase when the lossy ADP retransmission feature was enabled alongside the DCQCN congestion control algorithm, due to a low ACK timeout making ADP retransmissions overly aggressive. |
|
Keywords: Lossy ADP retransmission, Congestion Control |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4444306 |
Description: Fixed an issue where transitioning a QP attached to an XRQ to the error state using the 2ERR command could lead to request conflicts. The firmware now properly waits for all in-flight requests to complete before issuing a new event, ensuring the software can safely proceed with initializing a new QP. |
|
Keywords: NVMe-oF Target Offload |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4336970 |
Description: Reduced the bandwidth fluctuation induced by VQoS rate limiting in systems with fewer than 350 QPs. This change is enabled by default. |
|
Keywords: VQoS |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4336965 |
Description: Adjusted the RX lossless buffer default parameters to delay transmission of Pause/PFC frames when the NIC is congested. The RX lossless buffer parameters are now enabled by default. |
|
Keywords: RX lossless buffer size |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
|
|
4361179 |
Description: Fixed an issue that caused bandwidth to drop when unbinding multiple VFs with VQoS enabled. |
|
Keywords: VQoS |
|
|
Discovered in Version: 32.43.2566 |
|
|
Fixed in Release: 32.43.3608 |
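
The sentinel mix-up described in issue 4444874 can be illustrated in a few lines. This is not firmware code: only the two constant values come from the description, while the function names and the control flow are assumptions made for illustration. The point is that a mapping value of 0xFFF never equals 0xFF, so a check against the wrong sentinel cannot detect the "no GPIO function" case.

```python
# Constant values taken from the description of issue 4444874.
NO_GPIO_FUNCTION = 0xFFF  # default mapping output meaning "no GPIO assigned"
INVALID_READ = 0xFF       # different sentinel, compared against by mistake


def is_unmapped_old(mapping: int) -> bool:
    """Illustrative pre-fix check: compares against the wrong sentinel."""
    return mapping == INVALID_READ


def is_unmapped_new(mapping: int) -> bool:
    """Illustrative post-fix check: compares against NO_GPIO_FUNCTION."""
    return mapping == NO_GPIO_FUNCTION
```

With a mapping output of `NO_GPIO_FUNCTION`, `is_unmapped_old` returns `False` while `is_unmapped_new` returns `True`, so only the second check takes the intended branch.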
BlueField-2 Firmware Bug Fixes
|
Internal Ref. |
Issue |
|---|---|
|
4342749 |
Description: Fixed an issue where, if the summary queue size on initiators exceeds the SRQ size on the NVMe-oF target, RNR NACKs are triggered. The Congestion Control (CC) mechanism significantly reduces the rate in response to the presence of RNR, leading to a substantial drop in bandwidth during NVMe WRITE operations and mixed tests. |
|
Keywords: NVMe-oF target, RNR NACKs, Congestion Control (CC) |
|
|
Discovered in Version: 24.43.2566 |
|
|
Fixed in Release: 24.43.3608 |
|
|
4358188 |
Description: Fixed an issue where enabling DIM could lead to high IRQ/s in certain scenarios. |
|
Keywords: vDPA, DIM |
|
|
Discovered in Version: 24.43.2566 |
|
|
Fixed in Release: 24.43.3608 |
|
|
4355566 |
Description: Fixed high latency observed in IB_READ_LATENCY when eswitch scheduling is enabled and a rate limit is set. |
|
Keywords: Data latency |
|
|
Discovered in Version: 24.43.2566 |
|
|
Fixed in Release: 24.43.3608 |