AWS OFI NCCL v1.12.0
AmedeoSapio released this 08 Oct 01:41
This release is intended only for use on AWS P* instances. A general release that supports other Libfabric networks will be made in the near future. This release requires Libfabric v1.18.0 or later and supports NCCL v2.23.4-1 while maintaining backward compatibility with older NCCL versions (v2.17.1 and later).
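The version requirements above can be checked before deployment. The following is a minimal sketch, not part of the plugin: the helper names `parse_version` and `meets_requirements` are hypothetical, and it assumes you already have the installed Libfabric and NCCL versions as strings (for example, from your package manager).

```python
import re

# Minimum versions stated in these release notes.
MIN_LIBFABRIC = (1, 18, 0)
MIN_NCCL = (2, 17, 1)


def parse_version(text):
    """Return (major, minor, patch) from strings such as 'v1.18.0' or '2.23.4-1'.

    Hypothetical helper: a leading 'v' and any trailing suffix
    (such as the '-1' in '2.23.4-1') are ignored.
    """
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_requirements(libfabric_version, nccl_version):
    """Return True if both versions satisfy the minimums listed above."""
    return (
        parse_version(libfabric_version) >= MIN_LIBFABRIC
        and parse_version(nccl_version) >= MIN_NCCL
    )
```

Tuples compare element by element, so `(1, 20, 0) >= (1, 18, 0)` holds without any string comparison pitfalls such as `"1.9" > "1.18"`.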
New Features:
- Support for tuner v3 APIs
- Support for AllGather and ReduceScatter in the tuner
- Support for PAT algorithm in the tuner
Bug fixes:
- Fixed NULL pointer access in the endpoint per communicator path
- Replaced the NVLSTree option in the tuner with RING when nRanks == nNodes
The plugin has been tested with the following Libfabric providers using the tests bundled in the source code and the nccl-tests suite:
- efa
Checksum (sha512) for the release tarball:
7d9e41ce04253a32a13542e7f4c2d20c2a5a43cdfb575fe153954c5faed8cf85eb08dab76ee0f883109f7610bb43cb8b703fe2f1e98b8f02bbfa866dd1c268e1 aws-ofi-nccl-1.12.0-aws.tar.gz
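One way to verify a downloaded tarball against the checksum above is sketched below. It uses only the Python standard library; the function names `sha512_of_file` and `verify_tarball` are hypothetical, and the sketch assumes the tarball has already been downloaded to a local path.

```python
import hashlib
import hmac

# Checksum published in these release notes for aws-ofi-nccl-1.12.0-aws.tar.gz.
EXPECTED_SHA512 = (
    "7d9e41ce04253a32a13542e7f4c2d20c2a5a43cdfb575fe153954c5faed8cf85"
    "eb08dab76ee0f883109f7610bb43cb8b703fe2f1e98b8f02bbfa866dd1c268e1"
)


def sha512_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tarball(path, expected=EXPECTED_SHA512):
    """Return True if the file's SHA-512 digest matches the expected value."""
    return hmac.compare_digest(sha512_of_file(path), expected.lower())
```

For example, `verify_tarball("aws-ofi-nccl-1.12.0-aws.tar.gz")` should return `True` for an intact download of the release tarball and `False` otherwise.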