NVIDIA BlueField BMC Software

CEC and BMC Firmware Operations


The BMC and CEC firmware can be upgraded through the BMC from a remote server using the Redfish interface.

CEC and BMC Firmware Commands

Triggering Secure Firmware Update 

Required for BMC/CEC update.

Triggers the secure update and starts tracking the secure update progress.

  • Update with HttpPushUri API – Deprecated

    HttpPushUri API is obsolete. NVIDIA recommends users migrate to MultipartHttpPushUri API as support for HttpPushUri API will be discontinued in the future.


    curl -k -u root:'<password>' -H "Content-Type: application/octet-stream" -X POST -T <package_path> https://<bmc_ip>/redfish/v1/UpdateService/update
    


  • Multipart update with MultipartHttpPushUri API:

    curl -k -u root:'<password>' https://<bmc_ip>/redfish/v1/UpdateService/update-multipart -F 'UpdateParameters={};type=application/octet-stream' -F UpdateFile=@<package_path>
    


Where:

  • password – password of root user

  • bmc_ip – BMC IP address

  • package_path – firmware update package path
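As a rough Python counterpart to the multipart curl command above, the following sketch assembles the multipart/form-data body by hand using only the standard library. The function name build_multipart_update and the fixed boundary string are illustrative assumptions and not part of the BMC API; only the UpdateParameters and UpdateFile field names and the update-multipart path are taken from the command above.

```python
import json

# Hypothetical boundary; any string that does not occur in the payload works.
BOUNDARY = "bmc-update-boundary"


def build_multipart_update(package, filename, parameters=None):
    """Return (body, content_type) for a POST to
    /redfish/v1/UpdateService/update-multipart."""
    params = json.dumps(parameters or {}, separators=(",", ":")).encode()
    marker = b"--" + BOUNDARY.encode()
    parts = [
        # First part: UpdateParameters, typed as in the curl example.
        marker,
        b'Content-Disposition: form-data; name="UpdateParameters"',
        b"Content-Type: application/octet-stream",
        b"",
        params,
        # Second part: the firmware package bytes.
        marker,
        b'Content-Disposition: form-data; name="UpdateFile"; filename="'
        + filename.encode() + b'"',
        b"Content-Type: application/octet-stream",
        b"",
        package,
        # Closing boundary.
        marker + b"--",
        b"",
    ]
    body = b"\r\n".join(parts)
    return body, "multipart/form-data; boundary=" + BOUNDARY
```

The returned body and content type can then be sent with any HTTP client using basic authentication for the root user; disabling certificate verification corresponds to curl's -k option.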

Tracking Secure Firmware Update Progress

Required for BMC/CEC update.


curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/TaskService/Tasks

Find the current task ID in the response and use it for checking the progress:

curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/TaskService/Tasks/<task_id> | jq -r ' .PercentComplete'

Where:

  • password – password of root user

  • bmc_ip – BMC IP address

  • task_id – Task ID
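Picking the task ID out of the Tasks collection can be scripted. The sketch below assumes the collection follows the usual Redfish layout (a Members array of @odata.id links) and that task IDs are integers assigned in increasing order, so the largest one is the most recent; both points are assumptions, and latest_task_id is a hypothetical helper name.

```python
def latest_task_id(task_collection):
    """Return the highest numeric task ID found in a TaskService/Tasks
    collection, or None when no numeric ID is present."""
    ids = []
    for member in task_collection.get("Members", []):
        # Each member is expected to look like
        # {"@odata.id": "/redfish/v1/TaskService/Tasks/0"}.
        tail = member.get("@odata.id", "").rstrip("/").rsplit("/", 1)[-1]
        if tail.isdigit():
            ids.append(int(tail))
    return max(ids) if ids else None
```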

Resetting/Rebooting BMC

Required for BMC update.


curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"ResetType": "GracefulRestart"}' https://<bmc_ip>/redfish/v1/Managers/Bluefield_BMC/Actions/Manager.Reset

Where:

  • password – password of root user

  • bmc_ip – BMC IP address
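For scripted use, the same reset request can be prepared in Python with the standard library. This is a minimal sketch that only builds the request object; build_bmc_reset_request is a hypothetical helper name, while the URL, JSON body, and Content-Type header mirror the curl command above.

```python
import base64
import json
import urllib.request


def build_bmc_reset_request(bmc_ip, password, user="root"):
    """Build (but do not send) the Manager.Reset request shown above."""
    url = ("https://" + bmc_ip
           + "/redfish/v1/Managers/Bluefield_BMC/Actions/Manager.Reset")
    # HTTP basic authentication, equivalent to curl's -u option.
    token = base64.b64encode((user + ":" + password).encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps({"ResetType": "GracefulRestart"}).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )
```

Sending the request with urllib.request.urlopen requires an ssl context with certificate verification disabled to match the behavior of curl -k.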

Fetching Running BMC Firmware Version

Required for BMC update.

Fetches the running firmware version from the BMC.

  • For NVIDIA® BlueField®-3:

    curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/BMC_Firmware | jq -r ' .Version'
    

    Where:

    • password – password of root user

    • bmc_ip – BMC IP address

  • For NVIDIA® BlueField®-2:

    curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory
    

    Find the current firmware ID in the response, then run:

    curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/<firmware_id>_BMC_Firmware | jq -r ' .Version'
    

    Where:

    • password – password of root user

    • bmc_ip – BMC IP address

    • firmware_id – numeric value found in the FwInventory schema only. It is calculated during firmware update by the BMC and used to distinguish between the versions.
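On BlueField-2, the lookup of the prefixed inventory entry can be automated. The sketch below assumes the FirmwareInventory response is a standard Redfish collection with a Members array of @odata.id links; find_bmc_firmware_member is a hypothetical helper name.

```python
def find_bmc_firmware_member(inventory):
    """Return the inventory member name ending in '_BMC_Firmware'
    (for example '8c8549f3_BMC_Firmware'), or None if there is none."""
    for member in inventory.get("Members", []):
        # Keep only the last path component of the member link.
        name = member.get("@odata.id", "").rstrip("/").rsplit("/", 1)[-1]
        if name.endswith("_BMC_Firmware"):
            return name
    return None
```

The returned name can be appended to /redfish/v1/UpdateService/FirmwareInventory/ to query the Version property.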

Fetching Running CEC Firmware Version

Required for CEC update.

Fetches the running firmware version from CEC.

curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/Bluefield_FW_ERoT | jq -r ' .Version'

Where:

  • password – password of root user

  • bmc_ip – BMC IP address

ForceUpdate

Relevant for BlueField-3 only.


Required for BMC update.

ForceUpdate forces the update procedure to proceed even if the target version is identical.

  • When using HttpPushUri API, ForceUpdate is always true (i.e., forces the update procedure to proceed even if the target version is identical)

  • When using MultipartHttpPushUri API, ForceUpdate is false (i.e., the update procedure fails if the target version is identical), unless "ForceUpdate":true is included under UpdateParameters:

    curl -k -u root:'<password>' https://<bmc_ip>/redfish/v1/UpdateService/update-multipart -F 'UpdateParameters={"ForceUpdate":true};type=application/octet-stream' -F UpdateFile=@<package_path>
    

    Where:

    • password – password of root user

    • bmc_ip – BMC IP address
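The two ForceUpdate rules above can be summarized as a small decision function. This is only an illustration of the documented behavior for a package whose version matches the running firmware; the function name is hypothetical.

```python
def identical_version_update_proceeds(api, force_update=False):
    """Return True if an update to an identical version is allowed to
    proceed. `api` is "HttpPushUri" or "MultipartHttpPushUri"."""
    if api == "HttpPushUri":
        # ForceUpdate is always true for this API.
        return True
    if api == "MultipartHttpPushUri":
        # False unless "ForceUpdate":true is set under UpdateParameters.
        return bool(force_update)
    raise ValueError("unknown API: " + str(api))
```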

Updating BMC

Firmware update takes about 12 minutes.

After initiating the BMC secure update, a response similar to the following is received depending on whether HttpPushUri or MultipartHttpPushUri is used:

  • HttpPushUri API:

    curl -k -u root:'<password>' -H "Content-Type: application/octet-stream" -X POST -T <package_path> https://<bmc_ip>/redfish/v1/UpdateService
    
    {
      "@odata.id": "/redfish/v1/TaskService/Tasks/0",
      "@odata.type": "#Task.v1_4_3.Task",
      "Id": "<id>",
      "TaskState": "Running"
    }
    


  • MultipartHttpPushUri API:

    curl -k -u root:'<password>' https://<bmc_ip>/redfish/v1/UpdateService/update-multipart -F 'UpdateParameters={};type=application/octet-stream' -F UpdateFile=@<package_path> 
    {
      "@odata.id": "/redfish/v1/TaskService/Tasks/0",
      "@odata.type": "#Task.v1_4_3.Task",
      "Id": "<id>",
      "TaskState": "Running",
      "TaskStatus": "OK"
    }
    

    Where: 

    • package_path – the BMC firmware image file including its path

      For example: 

      curl -k -u root:'myP@ssword_12345!' https://10.10.10.10/redfish/v1/UpdateService/update-multipart -F 'UpdateParameters={};type=application/octet-stream' -F UpdateFile=@/root/bf2-bmc-ota-24.01-4-ipn.tar 
      


The following command is used to track secure firmware update progress:

curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/TaskService/Tasks/<id> | jq -r ' .PercentComplete'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2123  100  2123    0     0  38600      0 --:--:-- --:--:-- --:--:-- 37910
20

The task has completed when PercentComplete reaches 100.
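Instead of re-running the command by hand, the check can be wrapped in a polling loop. The sketch below takes a caller-supplied fetch_task function that returns the task resource as a dictionary, so it makes no assumption about the HTTP client. The polling interval and the 30-minute timeout are arbitrary illustrative defaults, and treating the Exception and Canceled task states as failures follows the "Possible Error Codes" section.

```python
import time


def wait_for_task(fetch_task, interval=10.0, timeout=1800.0, sleep=time.sleep):
    """Poll until PercentComplete reaches 100 and return the final task.

    fetch_task returns the JSON of /redfish/v1/TaskService/Tasks/<id>
    as a dict each time it is called.
    """
    waited = 0.0
    while True:
        task = fetch_task()
        if task.get("PercentComplete") == 100:
            return task
        # Task states reported for failed updates.
        if task.get("TaskState") in ("Exception", "Canceled"):
            raise RuntimeError("update failed: TaskState=" + task["TaskState"])
        if waited >= timeout:
            raise TimeoutError("firmware update did not complete in time")
        sleep(interval)
        waited += interval
```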

The reboot option is disabled while the update is in progress. After confirming that PercentComplete has reached 100, reboot the BMC with the Manager.Reset action:

curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/TaskService/Tasks/<id> | jq -r ' .PercentComplete'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3822  100  3822    0     0  81319      0 --:--:-- --:--:-- --:--:-- 81319
100

curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"ResetType": "GracefulRestart"}' https://<bmc_ip>/redfish/v1/Managers/Bluefield_BMC/Actions/Manager.Reset
{
  "@Message.ExtendedInfo": [
    {
      "@odata.type": "#Message.v1_1_1.Message",
      "Message": "The request completed successfully.",
      "MessageArgs": [],
      "MessageId": "Base.1.13.0.Success",
      "MessageSeverity": "OK",
      "Resolution": "None"
    }
  ]
}

The following commands are used to verify the current BMC firmware version after reboot:

  • For BlueField-3:

    curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/BMC_Firmware | jq -r ' .Version'
    
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100   513  100   513    0     0   9679      0 --:--:-- --:--:-- --:--:--  9679
    


  • For BlueField-2:

    Fetch the firmware ID from FirmwareInventory:

    curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/
    {
      "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory",
      "@odata.type": "#SoftwareInventoryCollection.SoftwareInventoryCollection",
      "Members": [
        {
          "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/8c8549f3_BMC_Firmware"
    …

    Use the following command with the fetched firmware ID from the previous step:

    curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/8c8549f3_BMC_Firmware | jq -r ' .Version'
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100   471  100   471    0     0    622      0 --:--:-- --:--:-- --:--:--   621
    bmc-23.04

For BlueField-3 BMC only, when updating to an identical version using MultipartHttpPushUri, "ForceUpdate":true must be included under UpdateParameters:

curl -k -u root:'<password>' https://<bmc_ip>/redfish/v1/UpdateService/update-multipart -F 'UpdateParameters={"ForceUpdate":true};type=application/octet-stream' -F UpdateFile=@<package_path>

Otherwise, the update procedure fails with the message Component image is identical.

Updating CEC

Firmware update takes about 20 seconds.

After initiating the CEC secure update, a response similar to the following is received depending on whether HttpPushUri or MultipartHttpPushUri is used:

  • HttpPushUri API:

    curl -k -u root:'<password>' -H "Content-Type: application/octet-stream" -X POST -T <package_path> https://<bmc_ip>/redfish/v1/UpdateService
    {
      "@odata.id": "/redfish/v1/TaskService/Tasks/0",
      "@odata.type": "#Task.v1_4_3.Task",
      "Id": "0",
      "TaskState": "Running"
    }
    


  • MultipartHttpPushUri API:

    curl -k -u root:'<password>' https://<bmc_ip>/redfish/v1/UpdateService/update-multipart -F 'UpdateParameters={};type=application/octet-stream' -F UpdateFile=@<package_path>
    {
      "@odata.id": "/redfish/v1/TaskService/Tasks/0",
      "@odata.type": "#Task.v1_4_3.Task",
      "Id": "0",
      "TaskState": "Running",
      "TaskStatus": "OK"
    }
    

    Where:

    • package_path – the CEC firmware image file including its path

      For example: 

      curl -k -u root:'myP@ssword_12345!' https://10.10.10.10/redfish/v1/UpdateService/update-multipart -F 'UpdateParameters={};type=application/octet-stream' -F UpdateFile=@/root/cec_ota_BMGP-04.0f_debug.bin
      


The following command is used to track the progress of the CEC firmware update:

curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/TaskService/Tasks/0 | jq -r ' .PercentComplete'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2123  100  2123    0     0  38600      0 --:--:-- --:--:-- --:--:-- 37910
100


After the CEC secure update completes, the new firmware takes effect only after a CEC activation and reset (see "Activating New CEC").

The following command is used to verify the current CEC firmware version after reboot:

curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/Bluefield_FW_ERoT | jq -r ' .Version'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   421  100   421    0     0   1172      0 --:--:-- --:--:-- --:--:--  1172
19-4

Activating New CEC

This is relevant only for BlueField-3 networking platforms (DPU or SuperNIC).

To activate the new CEC firmware, it is necessary to reset the CEC device. The possible options are explained in the following subsections.

Resetting CEC and BMC Subsystems Using CEC Self-reset Command

The CEC self-reset command is supported in CEC version 00.02.0180.0000 and above. The command activates the new firmware on CEC and should only be used after the CEC update procedure is complete.

Redfish Command

curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST -d '{"ResetType": "GracefulRestart"}' https://<bmc_ip>/redfish/v1/Chassis/Bluefield_ERoT/Actions/Chassis.Reset

Where:

  • password – password of root user

  • bmc_ip – BMC IP address

Command response options:

  • The response in case of success:

    {
      "@Message.ExtendedInfo": [
        {
          "@odata.type": "#Message.v1_1_1.Message",
          "Message": "The request completed successfully.",
          "MessageArgs": [],
          "MessageId": "Base.1.15.0.Success",
          "MessageSeverity": "OK",
          "Resolution": "None"
        }
      ]
    } 
    


  • The response if there is no pending CEC firmware:

    curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST https://<bmc_ip>/redfish/v1/Chassis/Bluefield_ERoT/Actions/Chassis.Reset -d '{"ResetType":"GracefulRestart"}'
    {
      "error": {
        "@Message.ExtendedInfo": [
          {
            "@odata.type": "#Message.v1_1_1.Message",
            "Message": "The requested resource of type ERoT FW named 'Pending-ERoT-FW' was not found.",
            "MessageArgs": [
              "ERoT FW",
              "Pending-ERoT-FW"
            ],
            "MessageId": "Base.1.15.0.ResourceNotFound",
            "MessageSeverity": "Critical",
            "Resolution": "Provide a valid resource identifier and resubmit the request."
          }
        ],
        "code": "Base.1.15.0.ResourceNotFound",
        "message": "The requested resource of type ERoT FW named 'Pending-ERoT-FW' was not found."
      }
    }
    


  • The response if the command is not supported by the current CEC firmware:

    curl -k -u root:'<password>' -H "Content-Type: application/json" -X POST https://<bmc_ip>/redfish/v1/Chassis/Bluefield_ERoT/Actions/Chassis.Reset -d '{"ResetType":"GracefulRestart"}'
    {
      "error": {
        "@Message.ExtendedInfo": [
          {
            "@odata.type": "#Message.v1_1_1.Message",
            "Message": "The action ERoT self-reset is not supported by the resource.",
            "MessageArgs": [
              "ERoT self-reset"
            ],
            "MessageId": "Base.1.15.0.ActionNotSupported",
            "MessageSeverity": "Critical",
            "Resolution": "The action supplied cannot be resubmitted to the implementation.  Perhaps the action was invalid, the wrong resource was the target or the implementation documentation may be of assistance."
          }
        ],
        "code": "Base.1.15.0.ActionNotSupported",
        "message": "The action ERoT self-reset is not supported by the resource."
      }
    }
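A script that issues the self-reset can tell the three responses above apart by their MessageId. The sketch below is one possible way to do so, assuming the response body has already been parsed from JSON into a dictionary; classify_cec_reset_response and the returned labels are illustrative names.

```python
def classify_cec_reset_response(body):
    """Map a Chassis.Reset response body to one of the outcomes above."""
    if "error" in body:
        # Error responses carry the message ID in error.code.
        message_id = body["error"].get("code", "")
    else:
        info = body.get("@Message.ExtendedInfo", [{}])
        message_id = info[0].get("MessageId", "") if info else ""
    # A MessageId looks like "Base.1.15.0.Success"; keep the last component.
    key = message_id.rsplit(".", 1)[-1]
    return {
        "Success": "reset accepted",
        "ResourceNotFound": "no pending CEC firmware",
        "ActionNotSupported": "self-reset not supported by current CEC firmware",
    }.get(key, "unrecognized response")
```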
    


IPMI Command

ipmitool raw 0x32 0xD2

Command response options:

  • If successful, no response is returned; Glacier and the BMC then reset.

  • The response if there is no pending CEC firmware:

    Unable to send RAW command (channel=0x0 netfn=0x32 lun=0x0 cmd=0xd2 rsp=0xd6): Cannot execute command, command disabled
    


  • The response if the command is not supported by the current CEC firmware:

    Unable to send RAW command (channel=0x0 netfn=0x32 lun=0x0 cmd=0xd2 rsp=0xd5): Command not supported in present state
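The IPMI outcomes can be distinguished in a script by the completion code in the rsp= field of the ipmitool output. The sketch below maps only the two codes quoted above; classify_ipmi_reset_output is a hypothetical helper, and it assumes the caller has captured the text printed by ipmitool.

```python
import re

# Completion codes taken from the two error strings above.
_RSP_MEANING = {
    0xD6: "no pending CEC firmware",
    0xD5: "self-reset not supported by current CEC firmware",
}


def classify_ipmi_reset_output(output):
    """Interpret the text printed by `ipmitool raw 0x32 0xD2`."""
    if not output.strip():
        return "reset accepted"  # success prints nothing
    match = re.search(r"rsp=0x([0-9a-fA-F]+)", output)
    if match:
        code = int(match.group(1), 16)
        return _RSP_MEANING.get(code, "unrecognized completion code")
    return "unrecognized output"
```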
    


Log Event Entries Created per Response Type

  • Log event entry created in case of success:

    {
      "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/<Id>",
      "@odata.type": "#LogEntry.v1_13_0.LogEntry",
      "AdditionalDataURI": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/<Id>/attachment",
      "Created": "<Date>",
      "EntryType": "Event",
      "Id": "<Id>",
      "Message": "The request completed successfully.",
      "MessageArgs": [
        ""
      ],
      "MessageId": "Base.1.0.Success",
      "Name": "System Event Log Entry",
      "Resolution": "Ready for ERoT self Reset",
      "Resolved": false,
      "Severity": "OK"
    }
    


  • Log event entry created if there is no pending CEC firmware:

    curl -k -u root:'<password>' -H "Content-Type: application/json" -X GET https://<bmc_ip>/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/11
    {
      "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/<Id>",
      "@odata.type": "#LogEntry.v1_13_0.LogEntry",
      "AdditionalDataURI": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/<Id>/attachment",
      "Created": "<Date>",
      "EntryType": "Event",
      "Id": "<Id>",
      "Message": "Awaiting for an action to proceed with installing image '' on 'ERoT'.",
      "MessageArgs": [
        "",
        "ERoT"
      ],
      "MessageId": "Update.1.0.AwaitToUpdate",
      "Name": "System Event Log Entry",
      "Resolution": "Cannot perform ERoT self reset: There is no EC FW pending. Perform an ERoT FW update.",
      "Resolved": false,
      "Severity": "OK"
    }
    


  • Log event entry created if the command is not supported by the current CEC firmware:

    {
      "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/<Id>",
      "@odata.type": "#LogEntry.v1_13_0.LogEntry",
      "AdditionalDataURI": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/<Id>/attachment",
      "Created": "<Date>",
      "EntryType": "Event",
      "Id": "<Id>",
      "Message": "The action ERoT self Reset is not supported by the resource.",
      "MessageArgs": [
        "ERoT self Reset"
      ],
      "MessageId": "Base.1.0.ActionNotSupported",
      "Name": "System Event Log Entry",
      "Resolution": "Cannot perform ERoT self reset: The action is not supported by the current ERoT version.",
      "Resolved": false,
      "Severity": "OK"
    }
    


  • Possible log event entries created if an error occurs during the BMC shut down procedure (received after a success log event entry):

    {
      "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/<Id>",
    ...
      "MessageId": "Base.1.0.InternalError",
      "Name": "System Event Log Entry",
      "Resolution": "ERoT Reset: Failed to close services towards ERoT reset",
      "Resolved": false,
      "Severity": "Critical"
    }
    


    {
      "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/<Id>",
    ...
      "MessageId": "Base.1.0.InternalError",
      "Name": "System Event Log Entry",
      "Resolution": "ERoT Reset: Isolate operation may still be in progress.",
      "Resolved": false,
      "Severity": "Critical"
    }
    


    {
      "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/<Id>",
    ...
      "MessageId": "Base.1.0.InternalError",
      "Name": "System Event Log Entry",
      "Resolution": "ERoT Reset: Failed to unmount file system towards ERoT reset.",
      "Resolved": false,
      "Severity": "Critical"
    }
    


    {
      "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/<Id>",
    ...
      "MessageId": "Base.1.0.InternalError",
      "Name": "System Event Log Entry",
      "Resolution": "ERoT Reset: Failed to unmount file system. It is still mounted as read-write (rw)",
      "Resolved": false,
      "Severity": "Critical"
    }
    


Resetting CEC and BMC Subsystems Using IPMI I2C Command over SMBus Channel Connected to PCIe Golden Finger

This option is valid only for servers which support I2C over SMBus from the host BMC.


ipmitool raw 0x06 0x52 <BUS-ID> 0x82 0x00 0x03 0xFE
ipmitool raw 0x06 0x52 <BUS-ID> 0x82 0x00 0x01 0xFE
sleep 0.1
ipmitool raw 0x06 0x52 <BUS-ID> 0x82 0x00 0x01 0xFF
ipmitool raw 0x06 0x52 <BUS-ID> 0x82 0x00 0x03 0xFF


The BUS-ID value is system-specific. It identifies the bus through which the host BMC is connected to the SMBus of the target BlueField.


The format of the ipmitool i2c command is as follows:

ipmitool raw <netfn> <cmd> <bus-id> <addr> <read-count> <write-data1> <write-data2>
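To show how the four commands map onto this format, the following sketch generates the argument lists for a given bus ID and runs them with the 100 ms pause between the second and third command. The helper names are hypothetical, and the bus ID must be supplied by the caller because it depends on the system.

```python
import subprocess
import time


def i2c_reset_commands(bus_id):
    """Return the four ipmitool argument lists of the reset sequence."""
    def raw(data1, data2):
        # netfn 0x06, cmd 0x52, bus ID, addr 0x82, read-count 0x00,
        # followed by the two write-data bytes.
        return ["ipmitool", "raw", "0x06", "0x52", "0x%02X" % bus_id,
                "0x82", "0x00", "0x%02X" % data1, "0x%02X" % data2]
    return [raw(0x03, 0xFE), raw(0x01, 0xFE), raw(0x01, 0xFF), raw(0x03, 0xFF)]


def run_i2c_reset(bus_id, run=subprocess.run, sleep=time.sleep):
    """Execute the sequence, pausing before the third command."""
    for index, command in enumerate(i2c_reset_commands(bus_id)):
        if index == 2:
            sleep(0.1)  # 100 ms
        run(command, check=True)
```

The run and sleep parameters exist only so that the sequence can be dry-run or tested without touching real hardware.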


Resetting Entire BlueField

This option typically involves a full power cycle of the host platform.

CEC Background Update Status

This section is relevant only for BlueField-3.

Both the BMC and the CEC keep an active and an inactive copy of the same firmware image on their respective SPI flash. A firmware update is written to the inactive copy. After a successful boot from the newly updated (now active) image, the other copy (i.e., the previously active image) is updated with the latest image.

Firmware update cannot be initiated if the background copy is in progress.

To check the status of the background update:

curl -k -u root:'<password>' -X GET https://<bmc_ip>/redfish/v1/Chassis/Bluefield_ERoT 
... 
  "Oem": { 
    "Nvidia": { 
      "@odata.type": "#NvidiaChassis.v1_0_0.NvidiaChassis", 
      "AutomaticBackgroundCopyEnabled": true, 
      "BackgroundCopyStatus": "Completed", 
      "InbandUpdatePolicyEnabled": true 
    } 
  } 
… 


The background update initially indicates InProgress while the inactive copy of the image is being updated.
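Before starting a new update, a script can check this property and wait while the copy is running. The sketch below reads BackgroundCopyStatus from the parsed chassis response; background_copy_in_progress is a hypothetical helper name, and only the InProgress and Completed values shown above are assumed.

```python
def background_copy_in_progress(chassis):
    """Return True when Oem.Nvidia.BackgroundCopyStatus is "InProgress"."""
    nvidia = chassis.get("Oem", {}).get("Nvidia", {})
    return nvidia.get("BackgroundCopyStatus") == "InProgress"
```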

Possible Error Codes

This section is relevant only for BlueField-3.


Each fault listed below is followed by its diagnosis and possible solution.

Connection to BMC breaks during firmware package transfer

  • Redfish task URI is not returned by the Redfish server

  • The Redfish server (if operational) is in idle state

  • After a reboot of BMC, or restart/recovery of the Redfish server, the Redfish server is in idle state

A new firmware update can be attempted by the Redfish client.

Connection to BMC breaks during firmware update

  • Redfish task URI previously returned by the Redfish server is no longer accessible

  • The Redfish server (if operational) is in one of the following states:

    • In idle state, if the firmware update has completed

    • In update state, if the firmware update is still ongoing

  • After a BMC reboot, or the restart/recovery of the Redfish server, the Redfish server is in idle state

A new firmware update can be attempted by the Redfish client.

Two firmware update requests are initiated

The Redfish server blocks the second firmware update request and returns the following:

  • HTTP code 400 "Bad Request"

  • Redfish message based on standard registry entry UpdateInProgress

  • A resolution is proposed: "Another update is in progress. Retry the update operation once it is complete."

Check the status of the ongoing firmware update by looking at the TaskCollection resource.

Redfish task hangs

  • Redfish task URI previously returned by the Redfish server is no longer accessible

  • PLDM-based firmware update progresses

  • After a reboot of BMC, or restart/recovery of the Redfish server, the Redfish server is in idle state

A new firmware update can be attempted by the Redfish client.

BMC-ERoT communication failure during image transfer

The Redfish task monitoring the firmware update indicates a failure:

  • TaskState is set to Exception

  • TaskStatus is set to Warning

  • Messages array in the task includes an entry based on the standard registry Update.1.0.0.TransferFailed indicating the components that failed during image transfer

The Redfish client may retry the firmware update.

Firmware update fails

The Redfish task monitoring the firmware update indicates a failure:

  • TaskState is set to Exception

  • TaskStatus is set to Warning

  • Messages array in the task includes an entry describing the error

The Redfish client may retry the firmware update.

ERoT failure (not responding)

The Redfish task monitoring the firmware update indicates a failure:

  • TaskState is set to Canceled

  • TaskStatus is set to Warning

  • Messages array in the task includes an entry describing the error

  • The Redfish client reports the error

The Redfish client may retry the firmware update.

Firmware image validation failure

The Redfish task monitoring the firmware update indicates a failure:

  • TaskState is set to Exception

  • TaskStatus is set to Warning

  • Messages array in the task includes an entry based on the standard registry Update.1.0.0.VerificationFailed to indicate the component for which verification failed

  • The Redfish client reports the error

The Redfish client may retry the firmware update.

Power loss before activation command is sent

  • The Redfish server is in idle state

A new firmware update can be attempted by the Redfish client.

Firmware activation failure

The Redfish task monitoring the firmware update indicates a failure:

  • TaskState is set to Exception

  • TaskStatus is set to Warning

  • Messages array in the task includes an entry based on the standard registry Update.1.0.ActivateFailed

The Redfish client may retry the firmware update.

Push to BMC firmware package greater than 200 MB

  • No Redfish task is created

  • The response includes an entry based on the standard registry Base.1.15.0.PayloadTooLarge with the resolution "Firmware package size is greater than allowed size"

Make sure the package size is less than the UpdateService.MaxImageSizeBytes property and retry the firmware update operation.
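The task-based failure patterns listed above can be matched programmatically. The following sketch is one possible mapping from a parsed task resource to a fault label. It relies only on the TaskState values and registry message names quoted in this section, and the function name and returned labels are illustrative.

```python
def diagnose_failed_task(task):
    """Return a fault label for a failed update task, or None when the
    task is not in a failure state."""
    state = task.get("TaskState")
    if state not in ("Exception", "Canceled"):
        return None
    known = (
        ("TransferFailed", "BMC-ERoT communication failure during image transfer"),
        ("VerificationFailed", "Firmware image validation failure"),
        ("ActivateFailed", "Firmware activation failure"),
    )
    for message in task.get("Messages", []):
        message_id = message.get("MessageId", "")
        for suffix, fault in known:
            # Registry IDs end with the message name, e.g.
            # "Update.1.0.0.TransferFailed".
            if message_id.endswith("." + suffix):
                return fault
    if state == "Canceled":
        return "ERoT failure (not responding)"
    return "Firmware update fails"
```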

