On the second day of DeepSeek Open Source Week, the DeepEP communication library was officially released. Developed by the DeepSeek team specifically for training and inference of Mixture-of-Experts (MoE) models, DeepEP is designed to improve communication efficiency when handling large-scale data and complex workloads.
DeepEP is optimized for MoE models, offering efficient all-to-all GPU communication kernels for the dispatch (token distribution) and combine (result aggregation) operations. These kernels support NVLink communication within a node and RDMA communication across nodes, ensuring high-efficiency data transfer between experts. Notably, DeepEP performs especially well with the group-limited gating algorithm proposed in the DeepSeek-V3 paper.
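To make the dispatch/combine terminology concrete, the following is a minimal sketch of the same pattern written with plain torch.distributed all-to-all collectives. It is not DeepEP's API: names such as moe_dispatch_combine and expert_fn are purely illustrative, and the sketch assumes all tensors live on the GPU (NCCL backend).

```python
import torch
import torch.distributed as dist

def moe_dispatch_combine(tokens, dest_rank, expert_fn):
    """tokens: [num_tokens, hidden]; dest_rank: [num_tokens] int tensor of target ranks."""
    world = dist.get_world_size()

    # Sort tokens by destination rank so each rank's outgoing chunk is contiguous.
    order = torch.argsort(dest_rank)
    sorted_tokens = tokens[order]
    send_counts = torch.bincount(dest_rank, minlength=world)

    # Exchange per-rank token counts so every rank knows how many tokens it will receive.
    recv_counts = torch.empty_like(send_counts)
    dist.all_to_all_single(recv_counts, send_counts)

    # Dispatch: all-to-all exchange of the routed token payloads.
    recv_tokens = tokens.new_empty((int(recv_counts.sum()), tokens.size(1)))
    dist.all_to_all_single(
        recv_tokens, sorted_tokens,
        output_split_sizes=recv_counts.tolist(),
        input_split_sizes=send_counts.tolist(),
    )

    # Local expert computation on the tokens routed to this rank.
    processed = expert_fn(recv_tokens)

    # Combine: reverse all-to-all returns processed tokens to their source ranks.
    combined_sorted = torch.empty_like(sorted_tokens)
    dist.all_to_all_single(
        combined_sorted, processed,
        output_split_sizes=send_counts.tolist(),
        input_split_sizes=recv_counts.tolist(),
    )

    # Undo the sort to restore the original token order.
    out = torch.empty_like(tokens)
    out[order] = combined_sorted
    return out
```

DeepEP's contribution is making exactly this exchange fast on NVLink within a node and RDMA across nodes, rather than relying on generic collectives as above.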
In terms of data formats, DeepEP supports low-precision formats such as FP8 and BF16, which help improve computational efficiency while reducing memory requirements. Additionally, the library introduces a hook-based communication-computation overlap method that does not occupy any GPU streaming multiprocessor (SM) resources, thereby maximizing computational efficiency.
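The overlap idea can be sketched with ordinary asynchronous collectives: start the exchange, hand back a hook, and invoke the hook only when the received data is actually needed, keeping the GPU busy in between. DeepEP's real mechanism is RDMA-based and avoids occupying SMs entirely; the snippet below, using torch.distributed async ops and hypothetical names like dispatch_with_hook, is only an illustration of the hook pattern.

```python
import torch
import torch.distributed as dist

def dispatch_with_hook(send_buf: torch.Tensor, recv_buf: torch.Tensor):
    """Start an async all-to-all and return a hook that waits for completion."""
    work = dist.all_to_all_single(recv_buf, send_buf, async_op=True)

    def hook() -> torch.Tensor:
        work.wait()      # block only when the result is actually required
        return recv_buf

    return hook

# Usage: overlap the exchange with unrelated computation.
# hook = dispatch_with_hook(send_buf, recv_buf)
# other_output = some_other_layer(other_input)   # runs while data is in flight
# received = hook()                               # retrieve tokens when needed
```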
Performance-wise, DeepEP delivers both high throughput and low latency. Tests on H800 GPUs with CX7 InfiniBand 400 Gb/s RDMA network cards show intra-node (NVLink) bottleneck bandwidth reaching 153 GB/s for dispatch and 158 GB/s for combine, while inter-node (RDMA) bottleneck bandwidth ranges from 43 to 47 GB/s. For low-latency inference decoding, DeepEP provides a set of kernels built on pure RDMA, achieving a dispatch latency of 163 microseconds and a combine latency of 318 microseconds when routing to 8 experts. As the number of experts grows, latency rises slightly but remains low even with 256 experts.
Regarding system compatibility, DeepEP primarily targets InfiniBand networks and can also run over RDMA over Converged Ethernet (RoCE). The library requires Hopper-architecture GPUs, Python 3.8 or higher, CUDA 12.3 or higher, and PyTorch 2.1 or higher.
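As a convenience (not part of DeepEP), the stated requirements can be checked programmatically before installation; the sketch below assumes a PyTorch environment and treats compute capability 9.0 as the marker for Hopper GPUs.

```python
import sys
import torch

def check_deepep_prereqs() -> None:
    # Python 3.8 or higher.
    assert sys.version_info >= (3, 8), "Python 3.8 or higher is required"

    # A CUDA-capable Hopper GPU (compute capability 9.0, e.g. H100/H800).
    assert torch.cuda.is_available(), "a CUDA-capable GPU is required"
    assert torch.cuda.get_device_capability() >= (9, 0), "a Hopper-architecture GPU is required"

    # CUDA 12.3 or higher, as reported by the PyTorch build.
    cuda_ver = tuple(int(x) for x in torch.version.cuda.split(".")[:2])
    assert cuda_ver >= (12, 3), "CUDA 12.3 or higher is required"

    # PyTorch 2.1 or higher.
    torch_ver = tuple(int(x) for x in torch.__version__.split(".")[:2])
    assert torch_ver >= (2, 1), "PyTorch 2.1 or higher is required"

check_deepep_prereqs()
```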
The release of DeepEP marks another important milestone for DeepSeek in the open-source community, providing strong support for MoE model research and application. As the DeepSeek Open Source Week continues, more open-source tools and libraries related to MoE models are expected to be launched, further advancing this field.