NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)

Changes and New Features

Feature/Change

Description

DragonFly (DFP+) Topologies with Dynamic Trees Support

Added support for DFP+ topologies while sharp_am operates in dynamic trees mode.

SHARP Reservation Resource Limit

Changed how the SHARP resource limit is applied per reservation: a new parameter, reservation_max_jobs_per_hca (described under Parameter Changes below), controls how many jobs may use the same HCA.
For further information, see the SHARP Reservation section.

Bug Fixes

See Bug Fixes.

Parameter Changes

Parameter

Component

Description

reservation_max_jobs_per_hca

sharp_am

New parameter. Numeric; sets the maximum number of jobs allowed to use the same HCA.

A value of 0 means no limit.

Valid range: 0-511.

Applies only while operating in reservation mode.

Default: 1 job per HCA.

dynamic_tree_algorithm

sharp_am

Sets which algorithm should be used by the dynamic tree mechanism.

Modified value 1 to include support for DragonFly topologies.

Current values:

0 - Regular Fat Tree oriented algorithm

1 - Quasi Fat Tree or DragonFly oriented algorithm
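
The parameter semantics in the table above can be modeled with a short Python sketch. This is an illustration only, not sharp_am code: the function names (validate_max_jobs_per_hca, hca_can_accept_job, describe_dynamic_tree_algorithm) are hypothetical, and how sharp_am enforces the limit internally is an assumption. Only the range 0-511, the default of 1, the "0 means no limit" rule, and the two algorithm values are taken from the table.

```python
# Documented bounds and default for reservation_max_jobs_per_hca.
MAX_JOBS_PER_HCA_MIN = 0
MAX_JOBS_PER_HCA_MAX = 511
MAX_JOBS_PER_HCA_DEFAULT = 1

# Documented values for dynamic_tree_algorithm.
DYNAMIC_TREE_ALGORITHMS = {
    0: "Regular Fat Tree oriented algorithm",
    1: "Quasi Fat Tree or DragonFly oriented algorithm",
}


def validate_max_jobs_per_hca(value: int = MAX_JOBS_PER_HCA_DEFAULT) -> int:
    """Return value if it lies in the documented range 0-511, else raise."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("reservation_max_jobs_per_hca must be an integer")
    if not MAX_JOBS_PER_HCA_MIN <= value <= MAX_JOBS_PER_HCA_MAX:
        raise ValueError("reservation_max_jobs_per_hca must be in 0-511")
    return value


def hca_can_accept_job(running_jobs: int, max_jobs_per_hca: int) -> bool:
    """Hypothetical admission check: a limit of 0 means no limit."""
    limit = validate_max_jobs_per_hca(max_jobs_per_hca)
    if limit == 0:
        return True
    return running_jobs < limit


def describe_dynamic_tree_algorithm(value: int) -> str:
    """Map a dynamic_tree_algorithm value to its documented meaning."""
    if value not in DYNAMIC_TREE_ALGORITHMS:
        raise ValueError("dynamic_tree_algorithm must be 0 or 1")
    return DYNAMIC_TREE_ALGORITHMS[value]
```

Under this model, the default limit of 1 lets a second job onto an HCA only after the first has finished, while a limit of 0 never rejects a job on the basis of HCA sharing.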

