DOCA SDK Documentation

DOCA Pipeline Language Runtime Controller Gateway SHM Application Guide

This document describes the usage of the NVIDIA DOCA Pipeline Language (DPL) Runtime Controller Gateway SHM sample application.

Introduction

This sample application combines the DOCA Pipeline Language (DPL) Services with the DPL Runtime Controller SDK to implement a VXLAN gateway. It provides encapsulation/decapsulation logic and interactive entry and counter management.

System Design

The following diagram illustrates the connections between the DPL Runtime Controller-based application and the DPL Runtime Service (DPL daemon). The DPL daemon is loaded with the DPL gateway_shm.p4 program, which implements the gateway logic.

The controller communicates with the DPL daemon through gRPC or Shared Memory (SHM) interfaces to insert, query, and delete HW steering rules. Traffic arriving from the wire or the VFs is processed according to the gateway program and passed to the relevant port based on the loaded program.

image-2025-10-19_17-31-15.png

Architecture

The application is based on two major components that work together to define and manage the forwarding state:

DPL Program (gateway_shm.p4)

A DPL program loaded onto the DPL Runtime Service:

/*
 * SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: LicenseRef-NvidiaProprietary
 *
 * NVIDIA CORPORATION, its affiliates and licensors retain all intellectual
 * property and proprietary rights in and to this material, related
 * documentation and any modifications thereto. Any use, reproduction,
 * disclosure or distribution of this material and related documentation
 * without an express license agreement from NVIDIA CORPORATION or
 * its affiliates is strictly prohibited.
 */

#include <doca_model.p4>
#include <doca_headers.p4>
#include <doca_externs.p4>
#include <doca_parser.p4>

/*
 * VxLAN tunnel gateway
 * This application allows the user to have a customized tunnel gateway that can stitch VxLAN
 * packets across tenant domains. End point traffic destined to local bare metal hosts can 
 * be decapsulated and forwarded, while gateway traffic can be decapsulated, then
 * encapsulated back to the wire. This program can be easily extended to be a gateway across
 * different tunnel types as well.
 */

/*
* Table sizes.
*/
const bit<32> DECAP_TABLE_SIZE = 32768;
const bit<32> ENCAP_TABLE_SIZE = 32768;

/* The directionality is based on network to host
 * The user will configure the DPL port IDs in the DPL RT configuration
 */
const bit<32> WIRE_PORT = 32w0;

struct headers_t {
    NV_FIXED_HEADERS
}

parser packet_parser(packet_in packet, out headers_t headers) {
    NV_FIXED_PARSER(packet, headers)
 }

/**
* This control performs the overlay policy including L2 encap with VxLAN
*/
control overlay_encap(
    inout headers_t headers,
    in nv_standard_metadata_t std_meta,
    inout nv_empty_metadata_t user_meta,
    inout nv_empty_metadata_t pkt_out_meta
) {
    NvDirectCounter(NvCounterType.PACKETS_AND_BYTES) encap_counter;

    action deny() {
        encap_counter.count();
        nv_drop();
    }

    action to_port(nv_logical_port_t port) {
        encap_counter.count();
        nv_send_to_port(port);
    }

    action vxlan_v4_encap(nv_mac_addr_t underlay_src_mac, nv_mac_addr_t underlay_dst_mac,
                 nv_ipv4_addr_t underlay_sip, nv_ipv4_addr_t underlay_dip, bit<24> vni, nv_logical_port_t port) {
        nv_set_vxlan_v4_underlay(headers, underlay_dst_mac, underlay_src_mac, underlay_sip, underlay_dip, vni);
        encap_counter.count();
        nv_send_to_port(port);
    }
  
    table encap_v4_table {
        key = {
            headers.ipv4.dst_addr : exact;
        }
        actions = {
            vxlan_v4_encap;
            to_port;
            deny;
        }
        size = ENCAP_TABLE_SIZE;
        default_action = deny;
        direct_counter = encap_counter;
        nv_high_update_rate = true;
    }

    apply {
       if (headers.ipv4.isValid()) {
           encap_v4_table.apply();
       }
    }
}


/**
* This control is for packets from wire to host (RX)
* and includes policy for L2 decap
*/
control underlay_decap(
    inout headers_t headers,
    in nv_standard_metadata_t std_meta,
    inout nv_empty_metadata_t user_meta,
    inout nv_empty_metadata_t pkt_out_meta
) {
    NvDirectCounter(NvCounterType.PACKETS_AND_BYTES) decap_counter;

    action deny() {
        decap_counter.count();
        nv_drop();
    }

    action decap() {
        decap_counter.count();
        nv_l2_decap(headers);
    }

    action to_port(nv_logical_port_t port) {
        nv_send_to_port(port);
    }

    action decap_to_port(nv_logical_port_t port) {
        decap();
        to_port(port);
    }

    table decap_v4_table {
        key = {
            headers.vxlan.vni : exact;
        }
        actions = {
            decap;
            to_port;
            decap_to_port;
            deny;
            NoAction;
        }
        size = DECAP_TABLE_SIZE;
        direct_counter = decap_counter;
        default_action = NoAction;
        nv_high_update_rate = true;
    }

    apply {
        if (headers.vxlan.isValid()) {
            decap_v4_table.apply();
        }
    }
}

control gateway(
    inout headers_t headers,
    in nv_standard_metadata_t std_meta,
    inout nv_empty_metadata_t user_meta,
    inout nv_empty_metadata_t pkt_out_meta
) {
    overlay_encap() over;
    underlay_decap() under;

    /* user should add entries that correspond to the wire ports
     * A hit means this is an RX packet, miss means a TX packet
    */
    table direction_table {
        key = {
            std_meta.ingress_port : exact;
        }
        actions = {
            NoAction;
        }
        default_action = NoAction;
        const entries = {
            (WIRE_PORT) : NoAction();
        }
    }

    apply {
        if (direction_table.apply().hit) {
            under.apply(headers, std_meta, user_meta, pkt_out_meta);
        }
        else {
            over.apply(headers, std_meta, user_meta, pkt_out_meta);
        }
    }
}

NvDocaPipeline(
    packet_parser(),
    gateway()
) main;

This P4 application implements a basic VXLAN termination and origination function for IPv4 traffic. Its primary goal is to differentiate between incoming packets from the underlay network (Rx/Decapsulation) and packets originating from a host (Tx/Encapsulation), applying the necessary L2 overlay policies in each direction.

The program logic is separated into three distinct control blocks: gateway, underlay_decap, and overlay_encap.

  • gateway: Directs packets into the relevant control block (underlay_decap or overlay_encap) by matching on the ingress port.

  • underlay_decap: Performs L2 decapsulation of packets from wire to host (Rx).

  • overlay_encap: Applies the overlay policy, including L2 VXLAN encapsulation of packets from host to wire (Tx).
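The dispatch performed by these three control blocks can be modelled in plain C++. The following is only a rough sketch: it is not DPL code and does not use the DPL Runtime Controller SDK. The Packet struct, the Verdict enum, and the GatewayModel struct are hypothetical names introduced here to illustrate the hit/miss logic of direction_table and the default actions of the two policy tables.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

// Hypothetical, simplified view of a parsed packet (not a DPL type).
struct Packet {
    uint32_t ingress_port;
    bool has_ipv4;      // models headers.ipv4.isValid()
    uint32_t ipv4_dst;  // meaningful only when has_ipv4 is true
    bool has_vxlan;     // models headers.vxlan.isValid()
    uint32_t vni;       // meaningful only when has_vxlan is true
};

enum class Verdict { Decap, Encap, Drop, NoAction };

struct GatewayModel {
    std::unordered_set<uint32_t> wire_ports;         // direction_table keys
    std::unordered_map<uint32_t, Verdict> decap_v4;  // VNI -> verdict
    std::unordered_map<uint32_t, Verdict> encap_v4;  // IPv4 dst -> verdict

    Verdict process(const Packet &p) const {
        if (wire_ports.count(p.ingress_port) != 0) {
            // direction_table hit: Rx path, handled by underlay_decap.
            if (!p.has_vxlan) return Verdict::NoAction;  // table is not applied
            auto it = decap_v4.find(p.vni);
            // default_action of decap_v4_table is NoAction.
            return it == decap_v4.end() ? Verdict::NoAction : it->second;
        }
        // direction_table miss: Tx path, handled by overlay_encap.
        if (!p.has_ipv4) return Verdict::NoAction;  // table is not applied
        auto it = encap_v4.find(p.ipv4_dst);
        // default_action of encap_v4_table is deny.
        return it == encap_v4.end() ? Verdict::Drop : it->second;
    }
};

// Usage example with rules resembling those in gateway.entries.json.
inline void gateway_model_example() {
    GatewayModel gw;
    gw.wire_ports.insert(0);                    // WIRE_PORT
    gw.decap_v4[1] = Verdict::Decap;            // VNI 1: decap_to_port
    gw.decap_v4[2] = Verdict::Drop;             // VNI 2: deny
    gw.encap_v4[0x06060604u] = Verdict::Encap;  // 6.6.6.4: vxlan_v4_encap

    assert(gw.process({0, true, 0, true, 1}) == Verdict::Decap);             // Rx, VNI 1
    assert(gw.process({0, true, 0, true, 7}) == Verdict::NoAction);          // Rx, unknown VNI
    assert(gw.process({1, true, 0x06060604u, false, 0}) == Verdict::Encap);  // Tx, known dst
    assert(gw.process({1, true, 0x0a000001u, false, 0}) == Verdict::Drop);   // Tx, unknown dst
}
```

The model highlights an asymmetry in the program: an unknown VNI on the Rx path falls through to NoAction, while an unknown destination on the Tx path is dropped.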

P4 Tables

Table Name | Control Block | Match Field | Actions | Purpose
--------------------------------------------------------------------------------
direction_table | gateway | std_meta.ingress_port | NoAction (default) | Determines the processing direction (Rx or Tx)
decap_v4_table | underlay_decap | headers.vxlan.vni | decap, to_port, decap_to_port, deny, NoAction (default) | Core policy table for decapsulation; identifies the tenant context
encap_v4_table | overlay_encap | headers.ipv4.dst_addr | vxlan_v4_encap, to_port, deny (default) | Core policy table for encapsulation; determines the tunnel endpoint and VNI


Direct Counters

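Counter Name | Tied to Table | Actions Tracked | Function
--------------------------------------------------------------------------------
decap_counter | decap_v4_table | decap, decap_to_port (through decap), deny | Counts successfully decapsulated packets and denied packets
encap_counter | encap_v4_table | vxlan_v4_encap, to_port, deny | Counts encapsulated packets, packets forwarded without encapsulation, and denied packets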
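As a rough illustration of how a direct counter behaves, the sketch below models a per-entry packets-and-bytes counter that is updated only when the entry's action calls count() in the gateway_shm.p4 listing. It is a plain C++ model, not SDK code: EntryCounter, action_counts, and on_hit are hypothetical names, and the action sets are read off the listing (decap_to_port is included because it invokes decap()).

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>

// Hypothetical per-entry counter, modelled on NvCounterType.PACKETS_AND_BYTES.
struct EntryCounter {
    uint64_t packets = 0;
    uint64_t bytes = 0;
};

// Returns true when the given action calls count() in the P4 listing,
// either directly or through another action.
inline bool action_counts(const std::string &table, const std::string &action) {
    static const std::set<std::string> encap = {"vxlan_v4_encap", "to_port", "deny"};
    static const std::set<std::string> decap = {"decap", "decap_to_port", "deny"};
    if (table == "encap_v4_table") return encap.count(action) > 0;
    if (table == "decap_v4_table") return decap.count(action) > 0;
    return false;
}

// Updates the counter of the entry that was hit, if its action counts.
inline void on_hit(EntryCounter &c, const std::string &table,
                   const std::string &action, uint64_t pkt_len) {
    if (!action_counts(table, action)) return;
    c.packets += 1;
    c.bytes += pkt_len;
}

// Usage example: two hits on decap_v4_table entries, only one is counted.
inline void counter_example() {
    EntryCounter c;
    on_hit(c, "decap_v4_table", "decap_to_port", 100);  // counted through decap()
    on_hit(c, "decap_v4_table", "to_port", 100);        // this to_port never counts
    assert(c.packets == 1 && c.bytes == 100);
}
```

Note that the two tables each define their own to_port action: the one in overlay_encap counts, the one in underlay_decap does not.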

Control Application (gateway.entries.json)

A control application manages the daemon's HW steering rules from a JSON input file that describes the desired rules.

{
    "doctype" : "gateway_shm.p4",
    "tables": {
    "encap_v4_table": {
        "entries": [
            {
                "match": {
                    "headers.ipv4.dst_addr": "6.6.6.4"
                },
                "action": "vxlan_v4_encap_encap_v4_table",
                "params": {
                    "underlay_src_mac": "3C:6D:66:11:11:11",
                    "underlay_dst_mac": "ff:ff:ff:ff:ff:ff",
                    "underlay_sip": "6.6.6.3",
                    "underlay_dip": "6.6.6.2",
                    "vni": "1",
                    "port": "0"
                }
            },
            {
                "match": {
                    "headers.ipv4.dst_addr": "6.6.6.5"
                },
                "action": "vxlan_v4_encap_encap_v4_table",
                "params": {
                    "underlay_src_mac": "3C:6D:66:11:11:11",
                    "underlay_dst_mac": "ff:ff:ff:ff:ff:ff",
                    "underlay_sip": "6.6.6.3",
                    "underlay_dip": "6.6.6.2",
                    "vni": "1",
                    "port": "0"
                }
            }
        ]
    },
    "decap_v4_table": {
        "entries": [
            {
                "match": {
                    "headers.vxlan.vni": "1"
                },
                "action": "decap_to_port_decap_v4_table",
                "params": {
                    "port": "1"
                }
            },
            {
                "match": {
                    "headers.vxlan.vni": "2"
                },
                "action": "deny_decap_v4_table"
            }
        ]
    }
    }
}
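All match values and parameters in this file are JSON strings, so the control application has to convert them into binary values before filling a table entry. The helpers below are a minimal sketch of that conversion using only the C++ standard library. The names parse_ipv4, parse_mac, and parse_vni are hypothetical and are not taken from the sample's source; the byte order expected by the SHM entry structures is not stated in this guide, so the host-order result is also an assumption.

```cpp
#include <array>
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>

// Parses dotted-quad text such as "6.6.6.4" into a host-order 32-bit value.
inline bool parse_ipv4(const std::string &s, uint32_t &out) {
    unsigned a, b, c, d;
    char tail;
    // Exactly four numbers must be converted; a fifth item means trailing junk.
    if (std::sscanf(s.c_str(), "%u.%u.%u.%u%c", &a, &b, &c, &d, &tail) != 4) return false;
    if (a > 255 || b > 255 || c > 255 || d > 255) return false;
    out = (a << 24) | (b << 16) | (c << 8) | d;
    return true;
}

// Parses colon-separated hex text such as "3C:6D:66:11:11:11" into six bytes.
inline bool parse_mac(const std::string &s, std::array<uint8_t, 6> &out) {
    unsigned b[6];
    char tail;
    if (std::sscanf(s.c_str(), "%2x:%2x:%2x:%2x:%2x:%2x%c",
                    &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &tail) != 6) return false;
    for (int i = 0; i < 6; i++) out[i] = static_cast<uint8_t>(b[i]);
    return true;
}

// Parses a decimal VNI and checks that it fits in the 24-bit field (bit<24>).
inline bool parse_vni(const std::string &s, uint32_t &out) {
    if (s.empty() || s[0] < '0' || s[0] > '9') return false;
    char *end = nullptr;
    errno = 0;
    unsigned long v = std::strtoul(s.c_str(), &end, 10);
    if (errno != 0 || *end != '\0' || v > 0xFFFFFFul) return false;
    out = static_cast<uint32_t>(v);
    return true;
}

// Usage example with values taken from gateway.entries.json.
inline void parse_example() {
    uint32_t ip = 0, vni = 0;
    std::array<uint8_t, 6> mac{};
    assert(parse_ipv4("6.6.6.4", ip) && ip == 0x06060604u);
    assert(parse_mac("3C:6D:66:11:11:11", mac) && mac[0] == 0x3C && mac[5] == 0x11);
    assert(parse_vni("1", vni) && vni == 1);
    assert(!parse_vni("16777216", vni));  // one past the largest 24-bit value
}
```

Rejecting malformed strings before an entry is allocated keeps a bad JSON file from producing a partially filled rule.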

Pipeline

The following diagram shows a schematic of the gateway_shm.p4 program. The program defines dynamic decap_v4_table and encap_v4_table tables with no pre-defined rules. The DPL Runtime Controller uses gateway.entries.json to insert custom rules with the desired match values and actions into these tables.

image-2025-10-19_11-42-49.png

DOCA Libraries

This application leverages the following DOCA libraries:

Refer to their respective programming guide for more information.

Compiling the Application

Please refer to the DOCA Installation Guide for Linux for details on how to install BlueField-related software.

DOCA reference applications are installed with full source code and build instructions. This allows you to compile them as-is or modify the source code to create custom versions.

For more information about the applications as well as development and compilation tips, refer to the DOCA Reference Applications page.

The source code for the application is located in the following directory: 

/opt/dpl_rt_controller/samples/gateway_shm/

Prerequisites

  • The application relies on the dpl_rt_controller library.

  • The application relies on the open-source json-c library, which requires the following package to be installed:

    sudo apt install libjson-c-dev
    
  • DPL Development Container installed on the host. See the DPL Installation Guide for more details.

Compiling the Application

To build the Gateway application using the dpl-dev container and the DPL compiler:

  1. On the x86 host, compile the DPL gateway_shm.p4 program:

    dplp4c.sh --target doca --odir /tmp/gateway_shm_out gateway_shm.p4 

    Because this application relies on the Shared Memory (SHM) interface, it must run directly on the DPU. Therefore, copy the compiler's output from the x86 host to the DPU's file system (e.g., /tmp/gateway_shm_out) before compiling the application.

  2. On the DPU Arm, compile the Gateway "C" application:

    cd /opt/dpl_rt_controller/samples/gateway_shm
    meson /tmp/gateway_shm -Dsample_programs_out=/tmp/gateway_shm_out
    ninja -C /tmp/gateway_shm

The dpl_sample_gateway_shm application is created in /tmp/gateway_shm.

Running the Application

The application is provided in source form and must be compiled before execution. For details, refer to the "Compiling the Application" section.

Prerequisites

Application Execution

  1. Start the DPL Runtime Service as detailed in the DPL Container Deployment.

  2. Run the gateway application. Usage:

    ./dpl_sample_gateway_shm <device_id> <p4info path> <program path> <json_entries_path>
    

    The device ID (first argument) must match the ID of the DPL device as configured in /etc/dpl_rt_service/devices.d/.

    Example:

    sudo /tmp/gateway_shm/dpl_sample_gateway_shm 1000 /tmp/gateway_shm_out/gateway_shm.p4info.txt /tmp/gateway_shm_out/gateway_shm.dplconfig gateway.entries.json
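The usage line above implies a simple validation step before the main logic runs: exactly four arguments, the first of which is a numeric device ID. The following is a hedged sketch of such a check. GatewayArgs and parse_gateway_args are hypothetical names, and the actual argument handling in gateway_main.cc may differ.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical container for the four command-line arguments.
struct GatewayArgs {
    uint32_t device_id = 0;
    std::string p4info_path;
    std::string program_path;
    std::string json_entries_path;
};

// Returns true and fills `out` when argv holds a valid argument set.
inline bool parse_gateway_args(int argc, const char *const argv[], GatewayArgs &out) {
    if (argc != 5) return false;  // program name + 4 arguments
    const char *id_text = argv[1];
    // The device ID must start with a digit; this rejects "", "-1" and " 7".
    if (id_text[0] < '0' || id_text[0] > '9') return false;
    char *end = nullptr;
    errno = 0;
    unsigned long id = std::strtoul(id_text, &end, 10);
    if (errno != 0 || *end != '\0' || id > UINT32_MAX) return false;
    out.device_id = static_cast<uint32_t>(id);
    out.p4info_path = argv[2];
    out.program_path = argv[3];
    out.json_entries_path = argv[4];
    return true;
}

// Usage example mirroring the command line shown above.
inline void args_example() {
    const char *argv[] = {"dpl_sample_gateway_shm", "1000",
                          "/tmp/gateway_shm_out/gateway_shm.p4info.txt",
                          "/tmp/gateway_shm_out/gateway_shm.dplconfig",
                          "gateway.entries.json"};
    GatewayArgs args;
    assert(parse_gateway_args(5, argv, args));
    assert(args.device_id == 1000);
    assert(args.json_entries_path == "gateway.entries.json");
}
```

The sketch only checks the form of the arguments; whether the device ID matches a configured DPL device can only be determined when connecting to the DPL Runtime Service.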
    

Application Code Flow

  1. gateway_main.cc:
    Parses the received arguments and calls the function that implements the main logic:

        doca_error_t gateway(uint32_t device_id, const char *p4info_path, const char *blob_path, const char *json_entries_path)

  2. gateway_sample.cc:
    Implements doca_error_t gateway() and runs the main logic:

      • Load the compiled program and p4info using gRPC:

          DPL_P4RT_Controller::Controller ctrl = DPL_P4RT_Controller::Controller(device_id, "localhost:9559");
          ctrl.LoadProgram(p4info_path, blob_path);

      • Connect to the DPL device:

          dpl_rt_controller_connect(&connect_attr, &device);

      • Insert the entries from gateway.entries.json using the SHM interface:

          doca_error_t insert_entries_from_json(device, json_entries_path, entries) {
              ...
              // Process encap_v4_table/decap_v4_table entries
              for (int i = 0; i < entries_count; i++) {
                  json_object *entry_obj = json_object_array_get_idx(entries_array, i);
                  struct dpl_rt_controller_entry *dpl_entry = nullptr;
                  dpl_shm_gateway_shm_gateway_over_encap_v4_table_entry_t *table_entry = nullptr;
                  dpl_shm_gateway_shm_gateway_over_encap_v4_table_alloc_entry_mem(device, &dpl_entry, &table_entry);
                  ...
                  /* Parse match fields, actions and parameters to construct the table_entry */
                  ...
                  // Insert the entry using the SHM API
                  dpl_shm_gateway_shm_insert_entry(dpl_entry, nullptr);
                  ...
              }
          }

      • Enter an interactive loop:

          while (!quit) {
              std::cout << "Press a key (e=entries, c=counters, q=quit): " << std::flush;
              ...
              // Display all entries on 'E'/'e' key press
              display_all_entries(entries);
              ...
              // Read counter data on 'C'/'c' key press
              read_entry_counters(device, entries);
              ...
          }

      • The system is now configured. Test packets can be sent (e.g., via scapy):

          sendp(Ether(src="00:11:11:11:11:11", dst="00:22:22:22:22:22") / IP(src="192.168.1.1", dst="192.168.1.2"), iface="eth2", count=50)

      • On 'E'/'e' key press, all table entries are displayed:

          void display_all_entries(entries) {
              ...
              // Iterate over entries per table ("encap_v4_table"/"decap_v4_table")
              for (const auto& table_pair : entries) {
                  const std::string& table_name = table_pair.first;
                  const std::vector<struct dpl_rt_controller_entry*>& table_entries = table_pair.second;
                  ...
                  // Iterate over entry handles
                  for (size_t i = 0; i < table_entries.size(); i++) {
                      void *shm_entry = dpl_rt_controller_table_entry_get_shm(table_entries[i]);
                      ...
                      /* Extract & display entry data using the controller interface */
                      ...
                  }
              }
          }

      • On 'C'/'c' key press, counter data for each entry is read.
        The application displays counter values for every entry because gateway_shm.p4 attaches direct counters to both tables. However, a counter is updated only when the entry's action calls count(), directly or through another action: vxlan_v4_encap, to_port, and deny in encap_v4_table, and decap, decap_to_port, and deny in decap_v4_table. Entries that use the to_port or NoAction actions of decap_v4_table therefore do not show updated values.

      • On quit, clean up the entries:

          doca_error_t cleanup_entries(entries) {
              ...
              // Iterate over entries per table
              for (const auto& table_pair : entries) {
                  ...
                  // Iterate over entry handles
                  for (size_t i = 0; i < table_entries.size(); i++) {
                      // Delete entry using the controller interface
                      dpl_rt_controller_table_entry_delete(table_entries[i]);
                      ...
                  }
              }
          }

        Then disconnect from the device and detach:

          dpl_rt_controller_disconnect(device);
          dpl_rt_controller_detach();
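The key handling of the interactive loop can be sketched independently of the SDK. In the model below, the real handlers (display_all_entries and read_entry_counters) are replaced by simple tallies so that the control flow can be exercised on its own. Command, LoopStats, command_for_key, and run_loop are hypothetical names, and reading keys with operator>> is an assumption about the input method, not a description of the sample's code.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>

enum class Command { ShowEntries, ReadCounters, Quit, Unknown };

// Maps a key press to a command; upper and lower case are equivalent.
inline Command command_for_key(char key) {
    switch (std::tolower(static_cast<unsigned char>(key))) {
    case 'e': return Command::ShowEntries;
    case 'c': return Command::ReadCounters;
    case 'q': return Command::Quit;
    default:  return Command::Unknown;
    }
}

// Tallies standing in for the real handlers.
struct LoopStats {
    int shown = 0;
    int counters_read = 0;
    bool quit = false;
};

// Reads keys from `in` until 'q'/'Q' or end of input.
inline LoopStats run_loop(std::istream &in) {
    LoopStats stats;
    char key;
    while (!stats.quit && in >> key) {
        switch (command_for_key(key)) {
        case Command::ShowEntries:  ++stats.shown; break;          // display_all_entries(...)
        case Command::ReadCounters: ++stats.counters_read; break;  // read_entry_counters(...)
        case Command::Quit:         stats.quit = true; break;
        case Command::Unknown:      break;                         // ignore other keys
        }
    }
    return stats;
}

// Usage example: the trailing 'e' is never processed because 'q' ends the loop.
inline void loop_example() {
    std::istringstream keys("e C x q e");
    LoopStats s = run_loop(keys);
    assert(s.shown == 1 && s.counters_read == 1 && s.quit);
}
```

Separating the key-to-command mapping from the loop makes it possible to test the dispatch without a terminal or a connected device.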
