Publication Number: US20250036920A1
Publication Date: 2025-01-30
Application Number: US18026140
Application Date: 2022-09-20
Inventors: Liang SHEN , Haifeng WANG , Huachao WU , Weibao GONG , Zhihua WU , Dianhai YU
IPC: G06N3/045 , G06N3/0495
Abstract: The present disclosure provides a mixture-of-experts (MoE) model implementation method and system, an electronic device, and a storage medium, relating to fields of artificial intelligence (AI) such as deep learning and distributed storage. The method includes: constructing a communication group, where the communication group includes a tensor-parallelism communication group, the tensor-parallelism communication group includes at least two computing devices, and tensor-parallelism segmentation is adopted for the sparse parameters of each computing device in the same tensor-parallelism communication group; and training an MoE model based on the communication group. With the solutions of the present disclosure, normal operation of model training can be guaranteed.
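To make the abstract's core idea concrete, below is a minimal sketch, not the patent's actual implementation, of how one might form tensor-parallelism communication groups and shard an MoE expert's sparse (expert) parameters across the devices in one group. It assumes PyTorch distributed as the framework; the names `TP_SIZE`, `build_tp_groups`, and `ExpertShard` are illustrative inventions, not identifiers from the disclosure.

```python
# Sketch: tensor-parallelism groups for MoE expert parameters.
# Assumes dist.init_process_group() has already been called on every rank.
import torch
import torch.distributed as dist

TP_SIZE = 2  # devices per tensor-parallelism group (assumed value)

def build_tp_groups(world_size: int, tp_size: int):
    """Partition all ranks into consecutive tensor-parallelism groups.

    Every rank must call dist.new_group() for every group, but each rank
    keeps only the handle of the group it belongs to.
    """
    my_group = None
    for start in range(0, world_size, tp_size):
        ranks = list(range(start, start + tp_size))
        group = dist.new_group(ranks=ranks)
        if dist.get_rank() in ranks:
            my_group = group
    return my_group

class ExpertShard(torch.nn.Module):
    """One expert's feed-forward layer, split across a tensor-parallelism group."""

    def __init__(self, hidden: int, ffn: int, tp_group):
        super().__init__()
        self.tp_group = tp_group
        tp_size = dist.get_world_size(group=tp_group)
        # Tensor-parallelism segmentation: each device holds 1/tp_size of
        # this expert's (sparse) parameters along the inner FFN dimension.
        self.w_in = torch.nn.Linear(hidden, ffn // tp_size, bias=False)
        self.w_out = torch.nn.Linear(ffn // tp_size, hidden, bias=False)

    def forward(self, x):
        partial = self.w_out(torch.relu(self.w_in(x)))
        # Sum the partial outputs from all shards of this expert so every
        # device in the group sees the full expert output.
        dist.all_reduce(partial, op=dist.ReduceOp.SUM, group=self.tp_group)
        return partial
```

In this sketch, splitting each expert's weights over at least two devices mirrors the abstract's requirement that a tensor-parallelism communication group contain at least two computing devices, which is what lets experts too large for a single device still train normally.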