We tried removing this in PR #1600, but doing so caused OOMs on previously working configs. In theory, it should still be possible to remove it to enable NCCL optimizations. A deeper investigation is needed into when removing it increases memory pressure/fragmentation.