Changes and New Features
| Feature/Change | Description |
|---|---|
| High Availability in sharp_am Network Interfaces | sharp_am leverages multiple network interfaces of the management host to provide high availability in case of a network interface failure. |
| Reliable Multicast | Added support for SHARP to leverage the reliable multicast option with NVIDIA Quantum-2. |
| SM Data | Removed support for reading SM data by a client application. The API functions sharp_request_sm_data, sharp_get_sm_data_buf_len, and sharp_get_sm_data have been removed and can no longer be used. In addition, the configuration parameter ftree_ca_order_file is ignored in sharp_am. |
| Bug Fixes | See Bug Fixes section. |
Parameter Changes
| Parameter | Component | Description |
|---|---|---|
| ignore_host_guids_file | sharp_am | New parameter: File with a list of Host GUIDs to be ignored for SHARP trees. Default: Null. |
| ignore_sm_guids | sharp_am | New parameter: Boolean parameter that controls whether SM GUIDs are ignored in SHARP trees parsed from the SMDB file. Default: True. |
| ftree_ca_order_file | sharp_am | Deprecated parameter: This parameter is now marked as deprecated. It is ignored and should not be used. |
| enable_sat | sharp_am | Deprecated parameter: This parameter controlled whether SHARP should allow SAT jobs. The parameter is now marked as deprecated. It is ignored and should not be used. SAT is always supported. |
| SHARP_COLL_SERIALIZE_MADS | libsharp | New parameter: Serialize SHARP MADs in tree connect and group join operations. It is recommended to set this flag to True when running mpirun with multiple groups. Default: False. |
| SHARP_COLL_JOB_REQUEST_RMC | libsharp | New parameter: If set to True, require that any allocated SHARP trees support the Reliable Multicast feature. Default: False. |
| SHARP_COLL_FORCE_BCAST_AS_ALLREDUCE | libsharp | New parameter: Force Bcast (rmc) to run as an Allreduce operation. Default: False. |
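
The new libsharp parameters listed above are typically supplied as environment variables to the MPI job. The following is a minimal sketch of how a launcher script might assemble them, not a documented procedure. It assumes an Open MPI style `mpirun` that exports variables with `-x`, and it assumes that `1` is an accepted spelling of True; both should be checked against your MPI and SHARP documentation. The helper names `sharp_env` and `mpirun_command` and the application path `./my_mpi_app` are hypothetical.

```python
import shlex


def sharp_env(serialize_mads=False, request_rmc=False, bcast_as_allreduce=False):
    """Return environment overrides for the new libsharp parameters.

    Only parameters switched on are returned, since all three default to False.
    Encoding True as "1" is an assumption, not taken from the release notes.
    """
    flags = {
        "SHARP_COLL_SERIALIZE_MADS": serialize_mads,
        "SHARP_COLL_JOB_REQUEST_RMC": request_rmc,
        "SHARP_COLL_FORCE_BCAST_AS_ALLREDUCE": bcast_as_allreduce,
    }
    return {name: "1" for name, enabled in flags.items() if enabled}


def mpirun_command(app, env, np=2):
    """Build an Open MPI style command line that exports each variable with -x."""
    cmd = ["mpirun", "-np", str(np)]
    for name, value in env.items():
        cmd += ["-x", f"{name}={value}"]
    cmd.append(app)
    return cmd


if __name__ == "__main__":
    # Hypothetical job: serialize MADs because several groups are created.
    env = sharp_env(serialize_mads=True)
    print(shlex.join(mpirun_command("./my_mpi_app", env, np=4)))
```

Keeping the flags in one helper makes it easy to see which non-default SHARP settings a job was launched with.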