-
Publication No.: US20250139327A1
Publication Date: 2025-05-01
Application No.: US18895722
Application Date: 2024-09-25
Inventor: Liang Shen, Jinle Zeng, Hongxiang Hao, Weibao Gong, Dianhai Yu, Haifeng Wang
IPC: G06F30/20
Abstract: A method for processing a model operator includes: determining an operator set for model networking, wherein the operator set comprises a plurality of operators; determining a storage amount occupied by an output tensor of each operator in the operator set and a computation time period consumed in a forward computation of each operator in the operator set; and determining a first operator participating in recomputation in a model from the operator set, based on the storage amounts and the computation time periods of the plurality of operators.
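The abstract does not say how the storage amounts and computation time periods are combined to pick the recomputed operators. As a minimal sketch of one possible reading, the Python below ranks operators by bytes of output tensor freed per millisecond of forward time and selects greedily until a memory target is met. The names `OperatorProfile`, `select_recompute_operators`, and `bytes_to_free`, as well as the greedy ratio rule itself, are illustrative assumptions and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OperatorProfile:
    """Hypothetical per-operator profile (not from the patent text)."""
    name: str
    output_bytes: int   # storage occupied by the operator's output tensor
    forward_ms: float   # time consumed by one forward computation


def select_recompute_operators(
    profiles: List[OperatorProfile], bytes_to_free: int
) -> List[OperatorProfile]:
    """Greedily pick operators to recompute instead of storing their outputs.

    Assumed heuristic: prefer operators that free the most memory per unit
    of extra forward time, and stop once the memory target is reached.
    """
    ranked = sorted(
        profiles,
        key=lambda p: p.output_bytes / max(p.forward_ms, 1e-9),
        reverse=True,
    )
    selected: List[OperatorProfile] = []
    freed = 0
    for profile in ranked:
        if freed >= bytes_to_free:
            break
        selected.append(profile)
        freed += profile.output_bytes
    return selected


MB = 1024 * 1024
example_profiles = [
    OperatorProfile("matmul", 64 * MB, 8.0),
    OperatorProfile("gelu", 64 * MB, 0.5),
    OperatorProfile("layer_norm", 32 * MB, 0.4),
    OperatorProfile("softmax", 16 * MB, 0.3),
]
chosen = select_recompute_operators(example_profiles, bytes_to_free=90 * MB)
print([p.name for p in chosen])
```

Under this assumed rule, cheap operators with large outputs are chosen first, while an expensive operator such as the hypothetical `matmul` entry is kept in memory unless the target cannot be met otherwise.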
-
2.
Publication No.: US20250103959A1
Publication Date: 2025-03-27
Application No.: US18885339
Application Date: 2024-09-13
Inventor: Liang Shen, Dianhai Yu, Weibao Gong, Jinle Zeng, Haifeng Wang
IPC: G06N20/00
Abstract: Provided are a performance optimization method for a model training device, an electronic device, and a storage medium, relating to the fields of deep learning, large model training, and distributed parallel strategies. The method includes: determining a communication timing of a current model training device with respect to a target model block at a target sorting position, so that the current device can perform collective communication synchronously with the other model training devices of a plurality of model training devices with respect to the model blocks at the target sorting position; and performing the collective communication on a backward gradient of the target model block at the communication timing.
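The abstract does not describe how the communication timing is computed. The sketch below illustrates one simple interpretation only: for each sorting position, every device starts the collective communication at the moment the slowest device has finished the backward pass of its block at that position. The function names, the `finish_times` layout, and the "wait for the slowest device" rule are all assumptions made for illustration, and no real communication library is called.

```python
from typing import List


def communication_schedule(finish_times: List[List[int]]) -> List[int]:
    """Return one start time per sorting position.

    finish_times[d][k] is the (hypothetical) time in ms at which device d
    finishes the backward computation of its block at sorting position k.
    Assumed rule: the collective for position k starts when all devices
    have that gradient ready, i.e. at the maximum finish time.
    """
    if not finish_times:
        return []
    num_positions = len(finish_times[0])
    if any(len(row) != num_positions for row in finish_times):
        raise ValueError("every device must report the same number of positions")
    return [max(row[k] for row in finish_times) for k in range(num_positions)]


def idle_times(finish_times: List[List[int]]) -> List[List[int]]:
    """Time each device waits at each position before communication starts."""
    schedule = communication_schedule(finish_times)
    return [[start - done for start, done in zip(schedule, row)]
            for row in finish_times]


# Two devices, three sorting positions (times in ms, made-up values).
example_finish = [
    [10, 25, 40],
    [12, 20, 45],
]
print(communication_schedule(example_finish))
print(idle_times(example_finish))
```

In this reading, aligning the start times per sorting position keeps all devices entering the same collective together, at the cost of the idle time computed by `idle_times`.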
-