Unified Collective Communication (UCC) was co-designed with industry partners for PyTorch-based deep learning recommender model training on multi-rail GPU platforms. It is also designed and implemented for high-performance PGAS applications and runtimes. UCC serves as a replacement for HCOLL: it implements the full range of HCOLL's hierarchical algorithms and fully supports NVIDIA GPUs.
For further information on what UCC is and how to use it, please see https://github.com/openucx/ucc
For the UCC PyTorch integration layer, Torch_UCC, please see https://github.com/facebookresearch/torch_ucc
UCC is integrated into PyTorch via the UCC process group.
UCC is supported in both MPI and OSHMEM and enabled by default.
- To toggle in MPI, set -mca coll_ucc_enable to 1/0.
- To toggle in OSHMEM, set -mca coll_scoll_enable to 1/0.

You can additionally set collective frameworks via the --mca coll parameter, e.g. --mca coll ucc,basic,libnbc for OMPI or --mca scoll ucc,basic,libnbc for OSHMEM.
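As a rough sketch of how the MPI options above might be combined on one command line, the snippet below assembles an mpirun invocation with the UCC toggle and an explicit component list. The helper name ucc_mpirun_cmd, the process count of 4, and the application ./my_app are placeholders assumed for illustration. The snippet only prints the command instead of launching it, so it can be tried without an MPI installation.

```shell
#!/bin/sh
# Hypothetical helper (not part of UCC or Open MPI): print an mpirun
# command line with the UCC collective component toggled on or off.
#   $1 = 1 to enable UCC, 0 to disable it
#   $2 = application to launch (placeholder)
ucc_mpirun_cmd() {
    enable="$1"
    app="$2"
    # -mca coll_ucc_enable toggles UCC in MPI; --mca coll lists the
    # collective components to use, as described in the text above.
    echo "mpirun -np 4 -mca coll_ucc_enable $enable --mca coll ucc,basic,libnbc $app"
}

# Dry run: show the command that would be executed with UCC enabled.
ucc_mpirun_cmd 1 ./my_app
```

Passing 0 as the first argument produces the equivalent command with UCC disabled, which can be handy when comparing collective performance with and without UCC.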