
NCCL Inspector plugin brings always-on AI observability
NVIDIA introduced the NCCL Inspector plugin to give teams always-on visibility into distributed AI training and inference. The NCCL Inspector plugin logs detailed metrics for every communicator and rank, which directly targets hidden bottlenecks in collective operations. Moreover, The release builds on the new plugin interface in NCCL 2.23, enabling fast adoption without patching training […]







