DOCA SDK Documentation

NVIDIA DOCA DMA Copy Application Guide


This guide provides an example of a DMA Copy implementation on top of NVIDIA® BlueField® DPU.

Introduction

The DOCA DMA (direct memory access) Copy application transfers files (data path) between the DPU and the x86 host, up to the maximum size supported by the hardware. It uses the DOCA DMA library, which provides an API for copying data between DOCA buffers using hardware acceleration and supports both local and remote memory.

DOCA DMA allows complex memory copy operations to be easily executed in an optimized, hardware-accelerated manner.

System Design

DOCA DMA Copy is designed to run on both the BlueField DPU and the x86 host. The DPU application must be started first because it opens the DOCA Comm Channel server between the two sides, over which all the necessary DOCA DMA library configuration files (control path) are transferred.

[Figure: system design diagram]

Application Architecture

DOCA DMA Copy runs on top of DOCA DMA to read/write directly from the host's memory without any user/kernel space context switches, allowing for a fast memory copy.

[Figure: application architecture diagram]

Flow:

  1. The two sides initiate a short negotiation in which the file size and location are determined.

  2. The host side creates the export descriptor with doca_mmap_export_pci() and sends it, together with the local buffer address and length, over the Comm Channel to the DPU side application.

  3. The DPU side application uses the received export descriptor to create a remote memory map locally with doca_mmap_create_from_export(), and uses the host buffer information to create a remote DOCA buffer.

  4. From this point on, the DPU side application has all the necessary memory information and the DMA copy can take place.
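
The buffer address and length sent in step 2 have to be serialized in some agreed form before they cross the Comm Channel. The following is a minimal sketch of one way to do that. The struct name host_buf_info, the 16-byte big-endian layout, and both helper functions are assumptions made for illustration; they are not taken from the application sources, and no DOCA calls are involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message describing the host buffer (not the application's real type). */
struct host_buf_info {
	uint64_t addr; /* host address of the buffer */
	uint64_t len;  /* buffer length in bytes */
};

/* Assumed wire format: two 64-bit big-endian fields. */
#define HOST_BUF_INFO_WIRE_SIZE 16

static void put_be64(uint8_t *dst, uint64_t v)
{
	for (int i = 7; i >= 0; i--) {
		dst[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

static uint64_t get_be64(const uint8_t *src)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | src[i];
	return v;
}

/* Returns the number of bytes written, or 0 if the output buffer is too small. */
size_t host_buf_info_pack(const struct host_buf_info *info, uint8_t *out, size_t out_len)
{
	if (out_len < HOST_BUF_INFO_WIRE_SIZE)
		return 0;
	put_be64(out, info->addr);
	put_be64(out + 8, info->len);
	return HOST_BUF_INFO_WIRE_SIZE;
}

/* Returns 0 on success, -1 if the received message is truncated. */
int host_buf_info_unpack(const uint8_t *in, size_t in_len, struct host_buf_info *info)
{
	if (in_len < HOST_BUF_INFO_WIRE_SIZE)
		return -1;
	info->addr = get_be64(in);
	info->len = get_be64(in + 8);
	return 0;
}
```

A fixed byte order keeps the message independent of the endianness of either side. The host would call host_buf_info_pack() before sending, and the DPU would call host_buf_info_unpack() on the received bytes before creating the remote DOCA buffer.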

DOCA Libraries

This application leverages the following DOCA libraries:

  • DOCA DMA

  • DOCA Comm Channel

Refer to their respective programming guides for more information.

Running the Application

Installation

Please refer to the NVIDIA DOCA Installation Guide for Linux for details on how to install BlueField-related software.

Application Execution

The DMA copy application is provided in both source and binary forms. The binary is located under /opt/mellanox/doca/applications/dma_copy/bin/doca_dma_copy.

  1. Application usage instructions: 

    Usage: doca_dma_copy [DOCA Flags] [Program Flags]
    
    DOCA Flags:
      -h, --help                        Print a help synopsis
      -v, --version                     Print program version information
      -l, --log-level                   Set the (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
      --sdk-log-level                   Set the SDK (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
      -j, --json <path>                 Parse all command flags from an input json file
    
    Program Flags:
      -f, --file                        Full path to file to be copied/created after a successful DMA copy
      -p, --pci-addr                    DOCA Comm Channel device PCI address
      -r, --rep-pci                     DOCA Comm Channel device representor PCI address (needed only on DPU)
    


    This usage printout can be displayed on the command line using the -h (or --help) option:

    /opt/mellanox/doca/applications/dma_copy/bin/doca_dma_copy -h
    




  2. CLI example for running the application on the BlueField:

    /opt/mellanox/doca/applications/dma_copy/bin/doca_dma_copy -p 03:00.0 -r 3b:00.0 -f received.txt
    


    Both the DOCA Comm Channel device PCIe address (03:00.0) and the DOCA Comm Channel device representor PCIe address (3b:00.0) should match the addresses of the desired PCIe devices.


  3. CLI example for running the application on the host:

    /opt/mellanox/doca/applications/dma_copy/bin/doca_dma_copy -p 3b:00.0 -f send.txt
    


    The DOCA Comm Channel device PCIe address, 3b:00.0, should match the address of the desired PCIe device.


  4. The application also supports a JSON-based deployment mode, in which all command-line arguments are provided through a JSON file:

    doca_dma_copy --json [json_file]
    

    For example:

    cd /opt/mellanox/doca/applications/dma_copy/bin
    ./doca_dma_copy --json ./dma_copy_params.json
    


    Before execution, ensure that the JSON file in use contains the correct configuration parameters, especially the PCIe addresses required for the deployment.
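
Because a mistyped PCIe address is an easy mistake to make in the commands above, a wrapper or deployment tool might sanity-check the string before launching the application. The sketch below validates the short bus:device.function form used in the examples (such as 3b:00.0). The struct pci_bdf and the function parse_pci_bdf() are hypothetical names, this is not how the application itself parses the flag, and the sketch assumes no domain prefix is present.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for a parsed bus:device.function triple. */
struct pci_bdf {
	uint8_t bus;
	uint8_t device;
	uint8_t function;
};

/* Accepts exactly "BB:DD.F" (hex bus, hex device, function 0-7).
 * Returns 0 and fills *out on success, -1 on malformed input. */
int parse_pci_bdf(const char *s, struct pci_bdf *out)
{
	unsigned long device;

	if (s == NULL || out == NULL || strlen(s) != 7 || s[2] != ':' || s[5] != '.')
		return -1;
	if (!isxdigit((unsigned char)s[0]) || !isxdigit((unsigned char)s[1]) ||
	    !isxdigit((unsigned char)s[3]) || !isxdigit((unsigned char)s[4]))
		return -1;
	if (s[6] < '0' || s[6] > '7')
		return -1;

	/* The device number is a 5-bit field, so it cannot exceed 0x1f. */
	device = strtoul(s + 3, NULL, 16);
	if (device > 0x1f)
		return -1;

	out->bus = (uint8_t)strtoul(s, NULL, 16); /* parsing stops at ':' */
	out->device = (uint8_t)device;
	out->function = (uint8_t)(s[6] - '0');
	return 0;
}
```

A passing check only confirms that the string is well formed. It does not confirm that a device exists at that address.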


Command Line Flags

Flag Type      Short Flag  Long Flag/JSON Key  Description                                             JSON Content
-------------  ----------  ------------------  ------------------------------------------------------  -------------------------
General flags  h           help                Print a help synopsis                                   N/A
               v           version             Print program version information                       N/A
               l           log-level           Set the log level for the application: DISABLE=10,      "log-level": 60
                                               CRITICAL=20, ERROR=30, WARNING=40, INFO=50, DEBUG=60,
                                               TRACE=70 (requires compilation with TRACE log level
                                               support)
               N/A         sdk-log-level       Set the SDK log level for the program: DISABLE=10,      "sdk-log-level": 40
                                               CRITICAL=20, ERROR=30, WARNING=40, INFO=50, DEBUG=60,
                                               TRACE=70
               j           json                Parse all command flags from an input JSON file         N/A
Program flags  f           file                Full path to file to be copied/created after a          "file": "/tmp/sample.txt"
                                               successful copy. This is a mandatory flag.
               p           pci-addr            DOCA Comm Channel device PCIe address.                  "pci-addr": "b1:00.0"
                                               This is a mandatory flag.
               r           rep-pci             DOCA Comm Channel device representor PCIe address.      "rep-pci": "b1:02.0"
                                               This is a mandatory flag only on the DPU.


Refer to DOCA Arg Parser for more information regarding the supported flags and execution modes.
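
The log-level and sdk-log-level flags take numeric values only, so a script that generates the command line or the JSON file may want to translate between names and numbers. The following sketch encodes the mapping listed for those flags. The functions log_level_name() and log_level_value() are hypothetical helpers and are not part of DOCA.

```c
#include <stddef.h>
#include <string.h>

struct log_level {
	int value;
	const char *name;
};

/* Numeric levels as documented for the -l/--log-level and --sdk-log-level flags. */
static const struct log_level log_levels[] = {
	{10, "DISABLE"}, {20, "CRITICAL"}, {30, "ERROR"}, {40, "WARNING"},
	{50, "INFO"},    {60, "DEBUG"},    {70, "TRACE"},
};

#define NUM_LOG_LEVELS (sizeof(log_levels) / sizeof(log_levels[0]))

/* Returns the level name for a numeric value, or NULL if the value is not a valid level. */
const char *log_level_name(int value)
{
	for (size_t i = 0; i < NUM_LOG_LEVELS; i++) {
		if (log_levels[i].value == value)
			return log_levels[i].name;
	}
	return NULL;
}

/* Returns the numeric value for a level name, or -1 if the name is unknown. */
int log_level_value(const char *name)
{
	for (size_t i = 0; i < NUM_LOG_LEVELS; i++) {
		if (strcmp(log_levels[i].name, name) == 0)
			return log_levels[i].value;
	}
	return -1;
}
```

For example, log_level_value("DEBUG") yields 60, the value to pass as -l 60 or to write as "log-level": 60 in the JSON file.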

Troubleshooting

Refer to the NVIDIA DOCA Troubleshooting Guide for any issue encountered with the installation or execution of the DOCA applications.

Recompiling the Application

In addition to the binary, the installation includes the application sources and compilation instructions, so the sources can be modified and the application recompiled. For more information about the applications, as well as development and compilation tips, refer to the DOCA Applications page.

The sources of the application can be found under the /opt/mellanox/doca/applications/dma_copy/src directory.

Recompiling All Applications

The applications are all defined under a single meson project, so the default compilation recompiles all the DOCA applications.

To build all the applications together, run:

cd /opt/mellanox/doca/applications/
meson /tmp/build 
ninja -C /tmp/build


doca_dma_copy is created under /tmp/build/dma_copy/src/.

Recompiling DMA Copy Application Only

To directly build only the DMA Copy application:

cd /opt/mellanox/doca/applications/
meson /tmp/build -Denable_all_applications=false -Denable_dma_copy=true
ninja -C /tmp/build


doca_dma_copy is created under /tmp/build/dma_copy/src/.

Alternatively, one can set the desired flags in the meson_options.txt file instead of providing them in the compilation command line:

  1. Edit the following flags in /opt/mellanox/doca/applications/meson_options.txt:

    • Set enable_all_applications to false

    • Set enable_dma_copy to true

  2. Run the following compilation commands:

    cd /opt/mellanox/doca/applications/
    meson /tmp/build 
    ninja -C /tmp/build
    


    doca_dma_copy is created under /tmp/build/dma_copy/src/.


Troubleshooting

Refer to the NVIDIA DOCA Troubleshooting Guide for any issue encountered with the compilation of the application.

Application Code Flow

  1. Parse the application arguments.

    1. Initialize the arg parser resources and register the DOCA general parameters.

      doca_argp_init();

    2. Register the DMA Copy application parameters.

      register_dma_copy_params();

    3. Parse the arguments.

      doca_argp_start();

  2. Initialize the Comm Channel endpoint.

    init_cc();

    1. Create the Comm Channel endpoint.

    2. Parse the user-provided PCIe address for the Comm Channel device.

    3. Open the Comm Channel DOCA device.

    4. Parse the user-provided PCIe address for the Comm Channel device representor (DPU side only).

    5. Open the Comm Channel DOCA device representor (DPU side only).

    6. Set the Comm Channel endpoint properties.

  3. Open the DOCA hardware device used to perform the copy.

    open_dma_device();

    1. Parse the PCIe address provided by the user.

    2. Create a list of all available DOCA devices.

    3. Find the appropriate DOCA device according to specific properties.

    4. Open the device.

  4. Create all required DOCA core objects.

    create_core_objects();

  5. Initialize the DOCA core objects.

    init_core_objects();

  6. Start the host/DPU DMA copy.

    1. Host side application:

      host_start_dma_copy();

      1. Start negotiation with the DPU side application for the location and size of the file.

      2. Allocate memory for the DMA buffer.

      3. Export the memory map and send the output (export descriptor) to the DPU side application.

      4. Send the host local buffer memory address and length over the Comm Channel to the DPU side application.

      5. Wait for the DPU to notify that the DMA copy ended.

      6. Close all memory objects.

      7. Clean up resources.

    2. DPU side application:

      dpu_start_dma_copy();

      1. Start negotiation with the host side application for the location and size of the file.

      2. Allocate memory for the DMA buffer.

      3. Receive the export descriptor over the Comm Channel.

      4. Create the DOCA memory map for the remote buffer on the host.

      5. Receive the host buffer information over the Comm Channel.

      6. Create two DOCA buffers, one for the remote (host) buffer and one for the local buffer.

      7. Submit the DMA copy task.

      8. Send a message to the host to notify it that the DMA copy ended.

      9. Clean up resources.

  7. Destroy the Comm Channel.

    destroy_cc();

  8. Destroy the DOCA core objects.

    destroy_core_objects();

  9. Destroy the arg parser.

    doca_argp_destroy();
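
The ordering of the top-level steps above, and the point that teardown should only undo what was actually set up, can be illustrated with a small stand-alone sketch. Everything here is a stand-in: step() only records a name, the step names loosely mirror the functions listed above without reproducing their real signatures, and the error-unwinding policy is an assumption rather than a description of the application's actual error handling.

```c
#include <stdio.h>
#include <string.h>

#define TRACE_MAX 256

static char trace[TRACE_MAX]; /* comma-separated record of the steps that ran */
static const char *fail_at;   /* name of the step that should fail, or NULL */

/* Stand-in for a real setup/teardown call: records the step and optionally fails. */
static int step(const char *name)
{
	size_t used = strlen(trace);

	snprintf(trace + used, TRACE_MAX - used, "%s%s", used ? "," : "", name);
	return (fail_at != NULL && strcmp(fail_at, name) == 0) ? -1 : 0;
}

const char *flow_trace(void)
{
	return trace;
}

/* Runs the flow; fail_step names a step to fail (NULL for a clean run). */
int run_flow(const char *fail_step)
{
	int rc;
	int cc_up = 0;   /* Comm Channel endpoint was initialized */
	int core_up = 0; /* DOCA core objects were created */

	trace[0] = '\0';
	fail_at = fail_step;

	if ((rc = step("argp_init")) != 0)
		return rc; /* nothing to undo yet */
	if ((rc = step("register_params")) != 0)
		goto cleanup;
	if ((rc = step("argp_start")) != 0)
		goto cleanup;
	if ((rc = step("init_cc")) != 0)
		goto cleanup;
	cc_up = 1;
	if ((rc = step("open_dma_device")) != 0)
		goto cleanup;
	if ((rc = step("create_core_objects")) != 0)
		goto cleanup;
	core_up = 1;
	if ((rc = step("init_core_objects")) != 0)
		goto cleanup;
	rc = step("start_dma_copy");

cleanup:
	fail_at = NULL; /* teardown stubs always succeed in this sketch */
	if (cc_up)
		(void)step("destroy_cc");
	if (core_up)
		(void)step("destroy_core_objects");
	(void)step("argp_destroy");
	return rc;
}
```

Calling run_flow(NULL) walks the whole sequence in the documented order, while run_flow("open_dma_device") stops after the failed step and tears down only the Comm Channel and the arg parser.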
    


References

  • /opt/mellanox/doca/applications/dma_copy/src

  • /opt/mellanox/doca/applications/dma_copy/bin/dma_copy_params.json
