#gpu
mKernel: Fusing Compute and Communication for GPU-Driven Scaling
mKernel eliminates host-driven communication bottlenecks by fusing intra-node NVLink transfers, inter-node RDMA transfers, and compute into persistent CUDA kernels, so that communication and computation overlap at the granularity of individual tiles.
MarkTechPost