DOCA SDK Documentation

DOCA Flow Connection Tracking

This guide provides an overview and configuration instructions for the DOCA Flow Connection Tracking (CT) API.

Introduction

The DOCA Flow Connection Tracking (CT) module is a 5-tuple table designed to efficiently track network connections using hardware resources. It supports the following key features:

  • Track zone and 5-tuple sessions – Track and manage network connections based on a 5-tuple (source IP, destination IP, source port, destination port, and protocol) along with zone-based separation

  • Zone-based virtual tables – Enable logical isolation using zones

  • Aging support – Remove idle connections automatically using configurable timeouts

  • Connection metadata – Set and manage metadata for tracked connections

  • Bidirectional packet handling – Manage traffic in both directions of a connection

  • High connection rate – Efficiently handle a high rate of connections per second (CPS)

The CT module makes it simple and efficient to track connections by leveraging hardware resources. 
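As a rough illustration of the lookup key described above, the following C sketch models a zone plus 5-tuple key and derives the reply-direction key by swapping the endpoints. The ct_key structure and both helper functions are hypothetical names used only for this example and are not part of the DOCA Flow CT API. IPv4 addresses and the absence of NAT are assumed for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a CT key: zone + 5-tuple (IPv4 only). */
struct ct_key {
	uint32_t zone;     /* logical (virtual) table of the connection */
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
	uint8_t proto;     /* e.g., 6 = TCP, 17 = UDP */
};

/* Field-by-field comparison (avoids comparing struct padding bytes). */
static bool ct_key_equal(const struct ct_key *a, const struct ct_key *b)
{
	return a->zone == b->zone && a->src_ip == b->src_ip &&
	       a->dst_ip == b->dst_ip && a->src_port == b->src_port &&
	       a->dst_port == b->dst_port && a->proto == b->proto;
}

/* Key of the opposite direction: same zone and protocol, endpoints swapped. */
static struct ct_key ct_key_reverse(struct ct_key k)
{
	struct ct_key r = k;

	r.src_ip = k.dst_ip;
	r.dst_ip = k.src_ip;
	r.src_port = k.dst_port;
	r.dst_port = k.src_port;
	return r;
}
```

Reversing a key twice yields the original key, which is why a single connection can be described by one origin key and one reply key.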

Architecture

The DOCA Flow CT Pipe is designed to handle non-encapsulated TCP and UDP packets. It supports two primary actions:

  • Forward to next pipe – For packets that match a known 6-tuple connection (5-tuple + zone)

  • Miss to next pipe – For packets without a matching connection entry

The application is responsible for handling packets based on these outcomes.

The DOCA Flow CT API consists of four major components:

  • CT module manipulation – Configure and manage resources within the CT module

  • CT connection entry manipulation – Add, remove, or update connection entries efficiently

  • Callbacks – Handle asynchronous processing results for connection entries

  • Pipe and entry statistics – Monitor connection tracking performance using pipe-level and entry-level statistics

These components provide flexible control over connection tracking and monitoring, allowing applications to adapt to various network scenarios effectively.

arch-diagram.png

Aging

Aging time refers to the maximum duration (in seconds) a session can remain active without detecting any packets. If no packets are observed within this period, the session is terminated.

To support aging, a dedicated aging thread is launched. This thread polls and checks counters for all active connections, ensuring that stale sessions are removed efficiently.

When aging is enabled, either the counter flag or a non-zero timeout must be set for at least one connection entry to trigger session expiration.
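The aging behavior described above can be pictured with a small software model. This is only a sketch: the session structure, the timestamp-based check, and the function names are assumptions made for illustration, whereas the actual aging thread polls hardware counters. Treating an idle time equal to the timeout as expired is also an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-session aging state. */
struct session {
	uint64_t last_seen_sec; /* time of the last observed packet */
	uint32_t timeout_sec;   /* 0 means "never age out" in this model */
	bool active;
};

/* A session is stale when no packet was seen for at least timeout_sec. */
static bool session_is_stale(const struct session *s, uint64_t now_sec)
{
	if (!s->active || s->timeout_sec == 0)
		return false;
	if (now_sec < s->last_seen_sec)
		return false; /* clock went backwards; do not expire */
	return now_sec - s->last_seen_sec >= s->timeout_sec;
}

/* One polling pass: deactivate stale sessions, return how many were removed. */
static size_t aging_sweep(struct session *tbl, size_t n, uint64_t now_sec)
{
	size_t removed = 0;

	for (size_t i = 0; i < n; i++) {
		if (session_is_stale(&tbl[i], now_sec)) {
			tbl[i].active = false;
			removed++;
		}
	}
	return removed;
}
```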

Managed Mode

In Managed Mode, the application is responsible for:

  • Managing worker threads

  • Parsing and handling connection lifecycles

This mode utilizes DOCA Flow CT management APIs for creating and destroying connections.

The CT aging module automatically notifies the application of aged-out connections by invoking callbacks.

Connection Rules and Management

Users can create connection rules with different patterns, metadata (meta), or counters, each of which can be applied separately per packet direction.

Users must manually define the appropriate meta and mask values for matching (match) and modifying (modify) packets.

To create rules in stages:

  1. Create one rule for a connection using the standard API.

  2. Add a second rule for the opposite packet direction using the doca_flow_ct_entry_add_dir() API.

managed-mode-diagram.png

Processing CT Entries

DOCA Flow provides specialized APIs to process CT entries using a dedicated queue:

  • doca_flow_entries_process – Processes pipe entries in the queue

  • doca_flow_aging_handle – Handles the aging of pipe entries

Some APIs, such as CT entry status queries and pipe miss queries, are not supported in Managed Mode.

Prerequisites

DPU

To enable DOCA Flow CT on the DPU, perform the following on the Arm:

  1. Enable iommu.passthrough in Linux boot commands (or disable SMMU from the DPU BIOS): 

    1. Run:

      sudo vim /etc/default/grub
      


    2. Set GRUB_CMDLINE_LINUX="iommu.passthrough=1".

    3. Run: 

      sudo update-grub
      sudo reboot
      


  2. Configure DPU firmware with LAG_RESOURCE_ALLOCATION=1:

    sudo mlxconfig -d <device-id> s LAG_RESOURCE_ALLOCATION=1
    


    Retrieve device-id from the output of the mst status -v command. If, under the MST tab, the value is N/A, run the mst start command.


  3. Update /etc/mellanox/mlnx-bf.conf as follows:

    ALLOW_SHARED_RQ="no"
    


  4. Perform a power cycle on the host and Arm sides.

  5. If working with a single port, set the DPU into e-switch mode: 

    sudo devlink dev eswitch set pci/<pcie-address> mode switchdev
    sudo devlink dev param set pci/<pcie-address> name esw_multiport value false cmode runtime
    


    Retrieve pcie-address from the output of the mst status -v command.


  6. If working with two PF ports, set the DPU into multi-port e-switch mode (for the 2 PCIe devices):

    sudo devlink dev param set pci/<pcie-address> name esw_multiport value true cmode runtime
    


    Retrieve pcie-address from the output of the mst status -v command.


  7. Define huge pages (see DOCA Flow prerequisites).

ConnectX

To enable DOCA Flow CT on the NVIDIA® ConnectX®, perform the following:

  1. Configure firmware with LAG_RESOURCE_ALLOCATION=1:

    sudo mlxconfig -d <device-id> s LAG_RESOURCE_ALLOCATION=1
    


    Retrieve device-id from the output of the mst status -v command. If, under the MST tab, the value is N/A, run the mst start command.


  2. Perform a power cycle.

  3. If working with a single port:

    sudo devlink dev eswitch set pci/<pcie-address> mode switchdev
    sudo devlink dev param set pci/<pcie-address> name esw_multiport value false cmode runtime
    


    Retrieve pcie-address from the output of the mst status -v command.


  4. If working with two PF ports:

    sudo devlink dev eswitch set pci/<pcie-address0> mode switchdev
    sudo devlink dev eswitch set pci/<pcie-address1> mode switchdev
    sudo devlink dev param set pci/<pcie-address0> name esw_multiport value true cmode runtime
    sudo devlink dev param set pci/<pcie-address1> name esw_multiport value true cmode runtime
    


    Retrieve pcie-address from the output of the mst status -v command.


  5. Define huge pages (see DOCA Flow prerequisites).

Actions

DOCA Flow CT supports actions based on meta and NAT operations. Each action can be defined as either shared or non-shared.

Action descriptors are not supported.

Shared Actions

Shared actions are predefined once and can be reused by multiple entries.

The user receives a handle for each shared action created and uses this handle to reference the action wherever it is required.

It is the user's responsibility to track shared actions and to remove them when they are no longer needed.

Shared actions are defined using a control queue (see the doca_flow_ct_cfg section).

Non-shared Actions

Non-shared actions are provided together with their data when an entry is created or updated.

These actions are completely managed by DOCA Flow CT and cannot be reused across multiple flows (e.g., NAT operations).
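Because tracking shared actions is left to the application, some form of bookkeeping is needed on the user side. The following is a minimal sketch of such bookkeeping, assuming a small fixed-size registry. All names (shared_registry, registry_track, and so on) are hypothetical and do not exist in DOCA Flow CT; the handles are assumed to be the uint32_t values returned when the shared actions were created.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_SHARED 8

/* Hypothetical application-side bookkeeping for shared action handles. */
struct shared_registry {
	uint32_t handle[MAX_SHARED];
	uint32_t users[MAX_SHARED]; /* entries currently referencing the handle */
	bool in_use[MAX_SHARED];
};

static int registry_find(const struct shared_registry *r, uint32_t handle)
{
	for (int i = 0; i < MAX_SHARED; i++) {
		if (r->in_use[i] && r->handle[i] == handle)
			return i;
	}
	return -1;
}

/* Remember a handle obtained when the shared action was created. */
static bool registry_track(struct shared_registry *r, uint32_t handle)
{
	if (registry_find(r, handle) >= 0)
		return false; /* already tracked */
	for (int i = 0; i < MAX_SHARED; i++) {
		if (!r->in_use[i]) {
			r->in_use[i] = true;
			r->handle[i] = handle;
			r->users[i] = 0;
			return true;
		}
	}
	return false; /* registry full */
}

/* Call when an entry starts using the shared action. */
static bool registry_acquire(struct shared_registry *r, uint32_t handle)
{
	int i = registry_find(r, handle);

	if (i < 0)
		return false;
	r->users[i]++;
	return true;
}

/* Call when an entry that used the shared action is removed or updated. */
static bool registry_release(struct shared_registry *r, uint32_t handle)
{
	int i = registry_find(r, handle);

	if (i < 0 || r->users[i] == 0)
		return false;
	r->users[i]--;
	return true;
}

/* In this model, a shared action may be removed once no entry references it. */
static bool registry_can_remove(const struct shared_registry *r, uint32_t handle)
{
	int i = registry_find(r, handle);

	return i >= 0 && r->users[i] == 0;
}
```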

Action Sets in Pipe Creation

When creating a DOCA Flow CT pipe, users must define action sets, just as they would for any other pipe.

Fields in the CT pipe must be marked as CHANGEABLE during pipe creation. This allows the actual criteria for these fields to be specified later during entry creation.

Only actions related to meta and NAT, as defined in struct doca_flow_ct_actions, are supported.

During entry creation or update, different actions can be specified for each direction, allowing variations in action content and/or action type.

Feature Enable

To enable user actions, administrators must configure the following parameters:

  • User action templates must be configured during the DOCA Flow CT pipe creation phase.

  • The maximum memory allocated for user actions (actions_mem_size) must be defined during DOCA Flow CT initialization.

Using Actions in Managed Mode

Init

When calling doca_flow_ct_init(), you must configure the following parameters:

  • nb_ctrl_queues: The total number of control queues dedicated to defining shared actions.

  • actions_mem_size: The maximum amount of memory (in bytes) allocated for user actions. This value must be strictly 64-byte aligned, and NVIDIA highly recommends utilizing a power of 2.
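The alignment requirement and the power-of-2 recommendation for actions_mem_size can be checked before initialization. The helpers below are a hedged sketch: their names are hypothetical, and a uint32_t size is assumed for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when size is a non-zero multiple of 64 bytes. */
static bool is_64b_aligned(uint32_t size)
{
	return size != 0 && (size & 63u) == 0;
}

/* True when size is a power of two. */
static bool is_pow2(uint32_t size)
{
	return size != 0 && (size & (size - 1)) == 0;
}

/* Round up to the next power of two that is at least 64.
 * Returns 0 if the result would not fit in 32 bits. */
static uint32_t round_up_actions_mem_size(uint32_t size)
{
	uint32_t p = 64;

	while (p < size) {
		if (p > UINT32_MAX / 2)
			return 0;
		p *= 2;
	}
	return p;
}
```

Any power of two that is at least 64 is automatically 64-byte aligned, so rounding up this way satisfies both the requirement and the recommendation.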

Create DOCA Flow CT Pipe

Configure action sets when calling doca_flow_pipe_create().

Create Shared Actions

Use doca_flow_ct_actions_add_shared() with one of the control queues.

Shared actions can be added at any time before use.

Add Entry

An entry can be created in one of the following ways:

  • Using an action handle of a predefined shared action

  • Using action data that is specific to the flow and cannot be shared (e.g., for NAT operations)

The entry can have different actions and/or different action types per direction.

Remove Entry

Non-shared actions associated with an entry are implicitly destroyed by DOCA Flow CT.

Shared actions are not destroyed. They remain usable until the user explicitly removes them.

Update Entry

Entry actions can be updated per direction. All combinations of shared/non-shared actions are applicable (e.g., update from shared to non-shared).

Changeable Forward

DOCA Flow CT permits the use of a different forward pipe for each flow direction. The module operates at one of two mutually exclusive forwarding levels:

  • Pipe level – A single forward pipe is defined during DOCA Flow CT pipe creation and applies to all entries universally.

  • Entry level – The forward pipe is defined dynamically during entry creation.

Entry-level forwarding characteristics:

  • It exclusively supports DOCA_FLOW_FWD_PIPE and DOCA_FLOW_FWD_ORDERED_LIST_PIPE (fixed pipe, changeable index).

  • It supports defining a distinct forward pipe per flow direction (both directions can utilize the same or different forward pipes).

  • Because there is no default forward pipe, forwarding destinations must be explicitly set upon each entry creation.

Enabling Changeable Forwarding

To enable this feature, create the DOCA Flow CT pipe using one of the following configurations:

  • Standard pipe:

    • Set forward type to DOCA_FLOW_FWD_PIPE 

    • Set next_pipe to NULL 

  • Ordered list pipe:

    • Set forward type to DOCA_FLOW_FWD_ORDERED_LIST_PIPE 

    • Set ordered_list_pipe.pipe to <ol_pipe> 

    • Set ordered_list_pipe.idx to UINT32_MAX 
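The two configurations above share one idea: the forward target is left open at pipe creation so that it can be supplied per entry. The following sketch models that rule with simplified stand-in types. The fwd_model structure, the enum, and fwd_is_changeable() are hypothetical and are not the DOCA Flow definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the forward settings named above. */
enum fwd_type { FWD_PIPE, FWD_ORDERED_LIST_PIPE };

struct fwd_model {
	enum fwd_type type;
	const void *next_pipe; /* used with FWD_PIPE */
	const void *ol_pipe;   /* used with FWD_ORDERED_LIST_PIPE */
	uint32_t ol_idx;       /* used with FWD_ORDERED_LIST_PIPE */
};

/* Entry-level (changeable) forwarding is requested by leaving the target
 * open: a NULL next pipe, or index UINT32_MAX on a fixed ordered list pipe. */
static bool fwd_is_changeable(const struct fwd_model *f)
{
	switch (f->type) {
	case FWD_PIPE:
		return f->next_pipe == NULL;
	case FWD_ORDERED_LIST_PIPE:
		return f->ol_pipe != NULL && f->ol_idx == UINT32_MAX;
	}
	return false;
}
```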

Using Changeable Forward in Managed Mode

To utilize changeable forwarding in Managed Mode, execute the following sequence:

  1. Initialize CT by calling doca_flow_ct_init().

  2. Create pipe by calling doca_flow_pipe_create() using the changeable forwarding configurations described above.

  3. Add entry by calling doca_flow_ct_add_entry(). During this step, set fwd_origin and/or fwd_reply to your desired targets.

  4. Update entry by calling doca_flow_ct_update_entry() to update the forwarding for a specific entry direction.

When updating the forward destination, you must explicitly pass all other parameters with their previously existing values.
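One way to satisfy this requirement is for the application to keep a snapshot of each entry's parameters and derive the update from it. The sketch below illustrates that pattern only. The entry_params structure, its fields, and make_fwd_update() are hypothetical, and integers stand in for forward targets.

```c
#include <stdint.h>

/* Hypothetical snapshot of the parameters an application keeps per entry. */
struct entry_params {
	uint32_t meta_origin;
	uint32_t meta_reply;
	uint32_t timeout_sec;
	int fwd_origin; /* stand-in for the origin forward target */
	int fwd_reply;  /* stand-in for the reply forward target */
};

enum direction { DIR_ORIGIN, DIR_REPLY };

/* Build the full parameter set for an update that changes only the forward
 * target of one direction; every other field is carried over unchanged. */
static struct entry_params make_fwd_update(const struct entry_params *cur,
					   enum direction dir, int new_fwd)
{
	struct entry_params next = *cur; /* start from the existing values */

	if (dir == DIR_ORIGIN)
		next.fwd_origin = new_fwd;
	else
		next.fwd_reply = new_fwd;
	return next;
}
```

The resulting structure holds every value that would then be passed to the update call, so no previously configured parameter is accidentally reset.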

Entry Iterator

When iterator support is enabled, DOCA Flow CT can traverse all entries on a CT pipe using a registered callback. For each invocation, the application retrieves full entry data (match, hash, flags) via doca_flow_ct_get_entry() and can recreate or mirror those entries.

A primary use case for this is High-Availability (HA) Synchronization: the application reads every active entry from the CT on the active node and programs matching entries on the standby node to preserve connection state during failovers. Iteration is incremental; the application drives progress by calling doca_flow_ct_entries_process() per queue, and the registered callback executes as entries are dispatched.

Enabling Iterator

  1. Create or configure the CT with the DOCA_FLOW_CT_FLAG_ITERATOR flag. 

    The CT duplication filter is backed by a hash table of active entries to prevent duplicate insertions while iteration and forwarding run concurrently.

  2. Start pipe-level iteration by calling doca_flow_ct_pipe_iterate(ct_pipe, iterate_cb, iterate_usr_ctx). This schedules the iteration across all queues for the given pipe.

  3. For each CT queue, call doca_flow_ct_entries_process() and pass the max_processed_entries limit. This processes the hardware steering queue and invokes iterate_cb as entries are delivered.

    • Inside the callback, read the entry details via doca_flow_ct_get_entry() to obtain the match, hash, and flags for standby replication.

    • If the number of processed entries returned is less than the requested max_processed_entries, the iteration for that specific queue has reached its end for the current pass.

  4. Pipe iteration formally completes once all participating queues have finished the incremental processing steps and no further callbacks are pending.
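The per-queue termination rule in the steps above (a call that returns fewer entries than requested ends the pass) can be sketched as a simple loop. The process_fn type is a hypothetical stand-in and does not reproduce the real signature of doca_flow_ct_entries_process(); mock_process() exists only to make the example self-contained.

```c
#include <stdint.h>

/* Hypothetical per-queue processing step: handles up to max entries and
 * returns how many were actually processed. */
typedef uint32_t (*process_fn)(void *ctx, uint16_t queue, uint32_t max);

/* Keep processing one queue until a call returns fewer entries than
 * requested, which marks the end of the current pass for that queue.
 * Returns the total number of entries processed. */
static uint64_t drain_queue(process_fn process, void *ctx, uint16_t queue,
			    uint32_t max_processed_entries)
{
	uint64_t total = 0;
	uint32_t n;

	do {
		n = process(ctx, queue, max_processed_entries);
		total += n;
	} while (max_processed_entries > 0 && n >= max_processed_entries);
	return total;
}

/* Mock backend: ctx points at a count of entries still pending. */
static uint32_t mock_process(void *ctx, uint16_t queue, uint32_t max)
{
	uint32_t *pending = ctx;
	uint32_t n = *pending < max ? *pending : max;

	(void)queue;
	*pending -= n;
	return n;
}
```

An application would run such a loop for every CT queue and consider the pipe iteration finished only after all queues have completed and no callbacks remain pending.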

Iterator Limitations

  • Action exclusion: Entry actions are not exported through the iterator path. The application must manually retain or reconstruct CT entry actions on the standby node.

  • Incomplete passes: New entries created during or immediately after a walk are not guaranteed to be captured in a single iterator pass. Applications should track new entries independently and not rely solely on the iterator for complete HA synchronization.

API

For the library API reference, refer to the DOCA Flow and CT API documentation.

DOCA Flow CT is part of the DOCA Flow library.

The following sections provide additional details about the library API.

enum doca_flow_ct_flags

Optional DOCA Flow CT configuration flags.

  • DOCA_FLOW_CT_FLAG_STATS – Enables internal pipe counters for packet tracking. Call doca_flow_pipe_dump(<ct_pipe>) to dump the changed counter values.

  • DOCA_FLOW_CT_FLAG_WORKER_STATS – Enables the periodic dump of worker thread internal debug counters.

  • DOCA_FLOW_CT_FLAG_NO_AGING – Disables aging.

  • DOCA_FLOW_CT_FLAG_ASYMMETRIC_TUNNEL – Allows utilizing tunnel or non-tunnel configurations in different directions.

  • DOCA_FLOW_CT_FLAG_NO_COUNTER – Disables counters and aging entirely to save aging-thread CPU cycles.

  • DOCA_FLOW_CT_FLAG_ITERATOR – Enables the entry iterator.

  • DOCA_FLOW_CT_FLAG_DUP_FILTER_UDP_ONLY – Applies the connection duplication filter strictly for UDP connections.

  • DOCA_FLOW_CT_FLAG_ORIGIN_WIRE – Indicates origin traffic will arrive from the wire. If set, mark actions can be utilized in the origin direction.

  • DOCA_FLOW_CT_FLAG_REPLY_WIRE – Indicates reply traffic will arrive from the wire. If set, mark actions can be utilized in the reply direction.

enum doca_flow_ct_entry_flags

Optional DOCA Flow CT entry flags.

  • DOCA_FLOW_CT_ENTRY_FLAGS_NO_WAIT = (1 << 0) – Entry is not buffered; it is sent to hardware immediately

  • DOCA_FLOW_CT_ENTRY_FLAGS_DIR_ORIGIN = (1 << 1) – Apply flags to the origin direction

  • DOCA_FLOW_CT_ENTRY_FLAGS_DIR_REPLY = (1 << 2) – Apply flags to the reply direction

  • DOCA_FLOW_CT_ENTRY_FLAGS_IPV6_ORIGIN = (1 << 3) – Origin direction is IPv6; the origin match union in struct doca_flow_ct_match is IPv6

  • DOCA_FLOW_CT_ENTRY_FLAGS_IPV6_REPLY = (1 << 4) – Reply direction is IPv6; the reply match union in struct doca_flow_ct_match is IPv6

  • DOCA_FLOW_CT_ENTRY_FLAGS_COUNTER_ORIGIN = (1 << 5) – Apply a counter to the origin direction

  • DOCA_FLOW_CT_ENTRY_FLAGS_COUNTER_REPLY = (1 << 6) – Apply a counter to the reply direction

  • DOCA_FLOW_CT_ENTRY_FLAGS_COUNTER_SHARED = (1 << 7) – Counter is shared by both directions (origin and reply)

  • DOCA_FLOW_CT_ENTRY_FLAGS_FLOW_LOG = (1 << 8) – Enable flow log when the entry is removed

  • DOCA_FLOW_CT_ENTRY_FLAGS_ALLOC_ON_MISS = (1 << 9) – Allocate the entry if it is not found when calling the doca_flow_ct_entry_prepare() API

  • DOCA_FLOW_CT_ENTRY_FLAGS_DUP_FILTER_ORIGIN = (1 << 10) – Enable the duplication filter on the origin direction

  • DOCA_FLOW_CT_ENTRY_FLAGS_DUP_FILTER_REPLY = (1 << 11) – Enable the duplication filter on the reply direction
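Since each entry flag occupies its own bit, flags are combined with a bitwise OR. The sketch below shows this with a local copy of a few values listed above. The shortened ENTRY_FLAGS_* names and both helper functions are defined here for illustration only; real code would use the constants from the DOCA Flow CT header instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copy of a few flag values from the list above (illustration only). */
enum {
	ENTRY_FLAGS_NO_WAIT     = (1 << 0),
	ENTRY_FLAGS_DIR_ORIGIN  = (1 << 1),
	ENTRY_FLAGS_DIR_REPLY   = (1 << 2),
	ENTRY_FLAGS_IPV6_ORIGIN = (1 << 3),
	ENTRY_FLAGS_IPV6_REPLY  = (1 << 4),
};

/* Compose flags for an entry covering both directions, marking each
 * direction as IPv6 when requested. */
static uint32_t build_entry_flags(bool ipv6_origin, bool ipv6_reply)
{
	uint32_t flags = ENTRY_FLAGS_DIR_ORIGIN | ENTRY_FLAGS_DIR_REPLY;

	if (ipv6_origin)
		flags |= ENTRY_FLAGS_IPV6_ORIGIN;
	if (ipv6_reply)
		flags |= ENTRY_FLAGS_IPV6_REPLY;
	return flags;
}

/* True when every bit of flag is set in flags. */
static bool has_flag(uint32_t flags, uint32_t flag)
{
	return (flags & flag) == flag;
}
```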

enum doca_flow_ct_rule_opr

Options for handling flows in autonomous mode with shared actions. The decision is taken on the first flow packet.

  • DOCA_FLOW_CT_RULE_OK – Flow should be defined in the CT pipe using the required shared actions handles

  • DOCA_FLOW_CT_RULE_DROP – Flow should not be defined in the CT pipe. The packet should be dropped.

  • DOCA_FLOW_CT_RULE_TX_ONLY – Flow should not be defined in the CT pipe. The packet should be transmitted.

struct direction_cfg

Managed mode configuration for origin or reply direction.

  • bool match_inner – 5-tuple match pattern applies to packet inner layer

  • struct doca_flow_meta *zone_match_mask – Mask to indicate meta field and bits to match

  • struct doca_flow_meta *meta_modify_mask – Mask to indicate meta field and bits to modify on connection packet match

doca_flow_ct_cfg

DOCA Flow CT configuration lifecycle manipulation:

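struct doca_flow_ct_cfg *ct_cfg;
doca_error_t ret;

/* Create the configuration object, then populate it with the setters */
ret = doca_flow_ct_cfg_create(&ct_cfg);
doca_flow_ct_cfg_set_flags(ct_cfg, flags);
doca_flow_ct_cfg_set_queues(ct_cfg, n_queues);
/* ... */
/* Initialize the CT module from the configuration */
ret = doca_flow_ct_init(ct_cfg);

final:
/* Destroy the configuration object */
ret = doca_flow_ct_cfg_destroy(ct_cfg);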

Configuration API methods:

  • doca_flow_ct_cfg_create – Creates the CT configuration object.

  • doca_flow_ct_cfg_destroy – Destroys the CT configuration object.

  • doca_flow_ct_cfg_set_flags – Sets the CT flags (refer to enum doca_flow_ct_flags).

  • doca_flow_ct_cfg_set_queues – Sets the number of hardware queues utilized to manipulate connections.

  • doca_flow_ct_cfg_set_queue_depth – Sets the queue depth (defaults to 512 rules).

  • doca_flow_ct_cfg_set_ctrl_queues – Sets the number of CT control queues used for defining shared actions.

  • doca_flow_ct_cfg_set_actions_mem_size – Sets the total CT actions memory size in bytes.

  • doca_flow_ct_cfg_set_entry_private_data_size – Sets the size of user private data allocated per connection.

  • doca_flow_ct_cfg_set_entry_finalize_cb – Sets the entry finalize callback to query final connection statistics.

  • doca_flow_ct_cfg_set_status_update_cb – Sets the status update callback to notify the application of counter changes.

  • doca_flow_ct_cfg_set_aging_core – Defines the specific CPU core ID to bind the CT aging thread to.

  • doca_flow_ct_cfg_set_aging_query_delay – Sets the CT aging query delay for newly created connections.

  • doca_flow_ct_cfg_set_aging_plugin_ops – Defines custom aging logic callbacks (falls back to default logic if omitted).

  • doca_flow_ct_cfg_set_direction – Configures the origin and reply directions.

Additional configuration notes:

  • CT session-related fields are governed by doca_flow_pipe_cfg and are configured via:

    • doca_flow_pipe_cfg_set_ct_connections()

    • doca_flow_pipe_cfg_set_ct_max_connections_per_zone()

    • doca_flow_pipe_cfg_set_ct_dup_filter_size()

  • CT counter configuration: DOCA Flow must be configured in per-port mode using doca_flow_cfg_set_resource_mode(cfg, DOCA_FLOW_RESOURCE_MODE_PORT). Define the number of CT counters via doca_flow_port_cfg_set_nr_resources(port_cfg, DOCA_FLOW_RESOURCE_COUNTER_CT, <n>).

struct doca_flow_ct_actions

This structure is used in the following cases:

  • For defining shared actions. In this case, action data is provided by the user. The action handle is returned by DOCA Flow CT.

  • For defining an entry with actions. The structure can be filled in one of two ways:

    • With action handle of a previously created shared action

    • With non-shared action data

DOCA Flow CT action structure.

struct doca_flow_ct_actions {
    enum doca_flow_resource_type resource_type;
    union {
        /* Used when creating an entry with a shared action. */
        uint32_t action_handle;

        /* Used when creating an entry with a non-shared action or when creating a shared action. */
        struct {
            uint32_t action_idx;
            struct doca_flow_meta meta;
            struct doca_flow_header_l4_port l4_port;
            union {
                struct doca_flow_ct_ip4 ip4;
                struct doca_flow_ct_ip6 ip6;
            };
        } data;
    };
};

Where:

  • enum doca_flow_resource_type resource_type – Shared/non-shared action

  • uint32_t action_handle – Shared action handle

  • uint32_t action_idx – Actions template index

  • struct doca_flow_meta meta – Modify meta values

  • struct doca_flow_header_l4_port l4_port – UDP or TCP source and destination port

  • struct doca_flow_ct_ip4 ip4 – Source and destination IPv4 addresses

  • struct doca_flow_ct_ip6 ip6 – Source and destination IPv6 addresses

The values in meta, l4_port, ip4, and ip6 must start from bit 0 (the least significant bit), regardless of which bits are set in the mask. For example, action_val.meta.u32[0] = DOCA_HTOBE32(0x12) combined with action_mask.meta.u32[0] = DOCA_HTOBE32(0x0000FF00) sets bits 15-8 to 0x12.
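The arithmetic behind this rule can be modeled in a few lines of plain C. This is a sketch of the described semantics only, not DOCA code: place_under_mask() and lowest_set_bit() are hypothetical helpers, a contiguous mask is assumed, and the byte-order conversion (DOCA_HTOBE32) is left out so that all values are in host order.

```c
#include <stdint.h>

/* Position of the least significant set bit of a non-zero mask. */
static unsigned lowest_set_bit(uint32_t mask)
{
	unsigned pos = 0;

	while ((mask & 1u) == 0) {
		mask >>= 1;
		pos++;
	}
	return pos;
}

/* The value is given starting at bit 0 and is shifted to where the mask
 * begins; bits that fall outside the mask are discarded. */
static uint32_t place_under_mask(uint32_t value, uint32_t mask)
{
	if (mask == 0)
		return 0;
	return (value << lowest_set_bit(mask)) & mask;
}
```

With value 0x12 and mask 0x0000FF00, the helper returns 0x00001200, that is, bits 15-8 hold 0x12 as in the example above.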

DOCA Flow Connection Tracking Samples

This section describes DOCA Flow CT samples based on the DOCA Flow CT pipe.

The samples illustrate how to use the library API to manage TCP/UDP connections. 

All the DOCA samples described in this section are governed under the BSD-3 software license agreement.

Running the Samples

  1. Refer to the following documents:

  2. To build a given sample, run the following command. If you downloaded the sample from GitHub, update the path in the first line to reflect the location of the sample file: 

    cd /opt/mellanox/doca/samples/doca_flow/flow_ct_udp
    meson /tmp/build
    ninja -C /tmp/build
    


    The binary doca_flow_ct_udp is created under /tmp/build/samples/.

  3. Sample (e.g., doca_flow_ct_udp) usage: 

    Usage: doca_<sample_name> [DOCA Flags] [Program Flags]
       
    DOCA Flags:
      -h, --help                              Print a help synopsis
      -v, --version                           Print program version information    
      -l, --log-level                         Set the (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
      --sdk-log-level                         Set the SDK (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
      -j, --json <path>                       Parse command line flags from an input json file
        
    Program Flags:
      -p, --pci_addr <PCI-ADDRESS>            PCIe device address
    
  4. For additional information per sample, use the -h option: 

    /tmp/build/samples/<sample_name> -h
    

The following is a CLI example for running the samples when port 08:00.0 is configured (multi-port e-switch) as manager port: 

/tmp/build/samples/doca_<sample_name> -- -r pci/08:00.0,pf0vf0 -l 60

The following is a CLI example for running the samples when port 08:00.0 is configured (multi-port e-switch) as manager port and 08:00.1 is configured as the representor of the second uplink:

/tmp/build/samples/doca_<sample_name> -- -r pci/08:00.1 -l 60

To prevent unexpected packets from affecting the test, the samples accept only packets such as the following:

  • IPv4 destination address is 1.1.1.1

  • IPv6 destination address is 0101:0101:0101:0101:0101:0101:0101:0101
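Both example addresses consist entirely of 0x01 bytes (1.1.1.1 in 4 bytes, 0101:0101:...:0101 in 16 bytes), so a test harness could recognize them with a single byte-wise check. The helper below is a hypothetical sketch of such a check and is not taken from the samples.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when addr is a 4-byte (IPv4) or 16-byte (IPv6) destination address
 * in which every byte equals 0x01. */
static bool is_sample_dst(const uint8_t *addr, size_t len)
{
	if (len != 4 && len != 16)
		return false;
	for (size_t i = 0; i < len; i++) {
		if (addr[i] != 0x01)
			return false;
	}
	return true;
}
```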

Samples List

  • doca_flow_ct_2_ports – Deploys two independent e-switches, each maintaining its own distinct CT state and pipeline.

  • doca_flow_ct_aging – Demonstrates CT aging using a pipe with entries that feature variable aging times and custom user data.

  • doca_flow_ct_iterator – Iterates through the CT pipe across two standalone e-switches.

  • doca_flow_ct_tcp – Utilizes CT in conjunction with TCP flags for robust session handling.

  • doca_flow_ct_tcp_actions – Attaches both shared and non-shared actions to a TCP CT implementation.

  • doca_flow_ct_tcp_entry_finalize – Leverages the CT entry finalize callback when sessions terminate or are manually removed.

  • doca_flow_ct_tcp_ipv4_ipv6 – Handles complex flows where each packet direction utilizes a different IP version.

  • doca_flow_ct_tcp_wire_to_wire – Executes a mark action on the CT for a strictly wire-to-wire TCP path.

  • doca_flow_ct_udp – Deploys a basic UDP pipeline that natively incorporates a CT pipe.

  • doca_flow_ct_udp_query – Queries the Flow CT UDP session state based on the origin or reply direction.

  • doca_flow_ct_udp_single_match – Creates a hardware CT entry applying a single-direction match within doca_flow_ct_add_entry().

  • doca_flow_ct_udp_tunnel_asymmetric – Implements an asymmetric tunnel mode for Flow CT (an extension of the core UDP query sample).

  • doca_flow_ct_udp_update – Dynamically updates CT entries post-creation, allowing inactive UDP sessions to receive updated aging timeouts.
