-
Publication No.: US20230169351A1
Publication Date: 2023-06-01
Application No.: US18060705
Filing Date: 2022-12-01
Inventor: Haifeng Wang , Zhihua Wu , Dianhai Yu , Yanjun Ma , Tian Wu
IPC: G06N3/098
CPC classification number: G06N3/098
Abstract: A distributed training method based on end-to-end adaptation, a device and a storage medium. The method includes: obtaining slicing results by slicing a model to be trained; obtaining an attribute of computing resources allocated to the model for training by parsing the computing resources, in which the computing resources are determined based on a computing resource requirement of the model, computing resources occupied by another model being trained, and idle computing resources, and the attribute of the computing resources is configured to represent at least one of a topology relation and a task processing capability of the computing resources; determining a distribution strategy of each of the slicing results in the computing resources based on the attribute of the computing resources; and performing distributed training on the model using the computing resources based on the distribution strategy.
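The slice-and-place idea in the abstract above can be sketched in a few lines of Python. This is only an illustrative approximation under simplifying assumptions: the function names, the use of contiguous layer groups as "slicing results", and the greedy capability-based placement are all hypothetical, not the patented algorithm.

```python
# Hypothetical sketch: slice a model's layers into contiguous groups, then
# assign each slice to the computing resource with the most remaining
# task-processing capability (one facet of the resource "attribute").

def slice_model(layers, num_slices):
    """Split a list of layers into num_slices contiguous slices."""
    size, rem = divmod(len(layers), num_slices)
    slices, start = [], 0
    for i in range(num_slices):
        end = start + size + (1 if i < rem else 0)
        slices.append(layers[start:end])
        start = end
    return slices

def plan_distribution(slices, resources):
    """Greedily map each slice index to the resource with the most
    capability left. resources: dict name -> capability score."""
    remaining = dict(resources)
    plan = {}
    for i, s in enumerate(slices):
        best = max(remaining, key=remaining.get)
        plan[i] = best
        remaining[best] -= len(s)  # crude load accounting
    return plan
```

For example, `slice_model(list(range(10)), 3)` yields slices of sizes 4, 3 and 3, which `plan_distribution` then assigns across the available devices.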
-
Publication No.: US20230115163A1
Publication Date: 2023-04-13
Application No.: US17989644
Filing Date: 2022-11-17
Inventor: Haifeng Wang , Xiaoguang Hu , Dianhai Yu , Xiang Lan , Yanjun Ma
Abstract: The disclosure provides a method for processing data and an electronic device. The method includes: obtaining first attribute information of input data and second attribute information of a computing device corresponding to the input data; selecting a target operator implementation mode from a plurality of candidate operator implementation modes based on the first attribute information and the second attribute information; determining a plurality of sub-operators included in an operator required for the input data from an operator library based on the target operator implementation mode, to generate the operator; and obtaining an operation result by performing an operation on the input data by the computing device based on the operator.
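The selection-then-composition flow described above can be illustrated with a toy sketch. The candidate modes, their constraints, and the sub-operator names below are invented for illustration; the actual attribute information and operator library are not specified in the abstract.

```python
# Hypothetical sketch: pick an operator implementation mode from candidates
# using input-data attributes (size) and device attributes (type), then
# compose the operator from that mode's sub-operators.

CANDIDATE_MODES = {
    # mode -> (min_input_size, required_device_type, sub-operators)
    "fused_gpu": (1024, "gpu", ["load", "fused_matmul_relu", "store"]),
    "tiled_cpu": (0,    "cpu", ["load", "matmul", "relu", "store"]),
    "generic":   (0,    None,  ["load", "matmul", "relu", "store"]),
}

def select_mode(input_size, device_type):
    """Return the first candidate mode whose constraints match the
    first (data) and second (device) attribute information."""
    for mode, (min_size, dev, _) in CANDIDATE_MODES.items():
        if input_size >= min_size and dev in (None, device_type):
            return mode
    return "generic"

def build_operator(mode):
    """Return the ordered sub-operators that make up the operator."""
    return CANDIDATE_MODES[mode][2]
```

A large tensor on a GPU would select the fused mode, while a small tensor on the same device falls through to the generic composition.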
-
Publication No.: US11620815B2
Publication Date: 2023-04-04
Application No.: US17938457
Filing Date: 2022-10-06
Inventor: Guanghua Yu , Qingqing Dang , Haoshuang Wang , Guanzhong Wang , Xiaoguang Hu , Dianhai Yu , Yanjun Ma , Qiwen Liu , Can Wen
IPC: G06V10/77 , G06V10/82 , G06V10/80 , G06V10/764
Abstract: A method for detecting an object in an image includes: obtaining an image to be detected; generating a plurality of feature maps based on the image to be detected by a plurality of feature extracting networks in a neural network model trained for object detection, in which the plurality of feature extracting networks are connected sequentially, and input data of a latter feature extracting network in the plurality of feature extracting networks is based on output data and input data of a previous feature extracting network; and generating an object detection result based on the plurality of feature maps by an object detecting network in the neural network model.
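The connection pattern in this abstract, where each feature-extracting network receives both the output and the input of the previous one, can be sketched with stand-in NumPy transforms. Concatenation is one assumed way of combining output and input; the real networks and combination rule are not given in the abstract.

```python
import numpy as np

# Sketch of the chained feature-extracting stages: stage k's input is built
# from the output AND the input of stage k-1 (here via concatenation).
# The "networks" are toy linear-plus-ReLU transforms.

def stage(x, weight):
    """A toy feature extractor: linear map followed by ReLU."""
    return np.maximum(x @ weight, 0.0)

def run_backbone(image_vec, weights):
    """Run sequential stages; collect one feature map per stage."""
    feature_maps = []
    current_input = image_vec
    for w in weights:
        out = stage(current_input, w)
        feature_maps.append(out)
        # next stage sees previous output concatenated with previous input
        current_input = np.concatenate([out, current_input])
    return feature_maps
```

Note how the input dimension grows stage by stage, so each weight matrix must account for the concatenated output-plus-input of the stage before it.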
-
Publication No.: US20250103959A1
Publication Date: 2025-03-27
Application No.: US18885339
Filing Date: 2024-09-13
Inventor: Liang Shen , Dianhai Yu , Weibao Gong , Jinle Zeng , Haifeng Wang
IPC: G06N20/00
Abstract: Provided is a performance optimization method for a model training device, an electronic device, and a storage medium, relating to the fields of deep learning, large model training, and distributed parallel strategies. The method includes: determining communication timing of a current model training device with respect to a target model block at a target sorting position, so as to be able to synchronously perform collective communication with the other model training devices among a plurality of model training devices with respect to model blocks at the target sorting position; and performing the collective communication on a backward gradient of the target model block at the communication timing.
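The timing idea above can be illustrated with a deliberately simplified schedule: if every device holds one model block per sorting position and positions are traversed last-to-first during the backward pass, all devices reach a given position's communication point at the same step. The step formula below is an assumption for illustration, not the patented scheduling rule.

```python
# Illustrative sketch: compute, per sorting position, the step at which all
# devices can synchronously run collective communication on that position's
# backward gradient. Assumes uniform backward cost per model block.

def communication_step(position, num_positions, backward_steps_per_block):
    """Step index for the collective communication of the block at
    `position` (backward pass visits positions last-to-first)."""
    blocks_done_before = num_positions - 1 - position
    return (blocks_done_before + 1) * backward_steps_per_block

def schedule(num_positions, backward_steps_per_block):
    """Map each sorting position to its collective-communication step."""
    return {p: communication_step(p, num_positions, backward_steps_per_block)
            for p in range(num_positions)}
```

Because the step depends only on the position, not on the device, every device arrives at the same communication timing for a given sorting position, which is what makes the collective synchronous.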
-
Publication No.: US12205025B2
Publication Date: 2025-01-21
Application No.: US17211146
Filing Date: 2021-03-24
Inventor: Haifeng Wang , Xiaoguang Hu , Dianhai Yu
Abstract: The present application discloses a processor video memory optimization method and apparatus for deep learning training tasks, and relates to the technical field of artificial intelligence. In the method, by determining an optimal path for transferring a computing result, the computing result of a first computing unit is transferred to a second computing unit by using the optimal path. Thus, occupation of the video memory is avoided, and the problem of low utilization of the GPU's computing unit caused by video memory swaps is also avoided, so that the training speed of most tasks is hardly reduced.
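One way to picture "determining an optimal path" is as a cost comparison between candidate transfer routes. The option names, PCIe bandwidth model, and cost formulas below are illustrative assumptions only; the abstract does not disclose the actual path model.

```python
# Hypothetical sketch: choose the cheapest path for a computing result,
# among keeping it in video memory, round-tripping through host memory,
# or recomputing it later. Costs are rough millisecond estimates.

def best_transfer_path(result_mb, free_vram_mb, pcie_gbps=16.0,
                       recompute_ms=None):
    """Return (path_name, estimated_cost_ms) for the cheapest option."""
    options = {}
    if result_mb <= free_vram_mb:
        options["keep_in_vram"] = 0.0  # no transfer needed
    # device->host + host->device round trip; pcie_gbps/8 is GB/s = MB/ms
    options["via_host_memory"] = 2 * result_mb / (pcie_gbps / 8)
    if recompute_ms is not None:
        options["recompute"] = recompute_ms
    path = min(options, key=options.get)
    return path, options[path]
```

When video memory is plentiful the result simply stays put at zero cost; when it is scarce, the choice between a host-memory round trip and recomputation falls out of the cost comparison.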
-
Publication No.: US20230186024A1
Publication Date: 2023-06-15
Application No.: US17874394
Filing Date: 2022-07-27
Inventor: Zeyu Chen , Haifeng Wang , Tian Wu , Dianhai Yu , Yanjun Ma , Xiaoguang Hu
IPC: G06F40/284 , G06F40/47
CPC classification number: G06F40/284 , G06F40/47
Abstract: Provided are a text processing method, a device and a storage medium, relating to the field of computer technology, and especially to the field of artificial intelligence, such as natural language processing and deep learning. The specific implementation scheme includes: performing text processing on a first text by using a text processing acceleration operator; and processing the resulting content in parallel, and thus faster, by using the text processing acceleration operator. Text processing and parallel acceleration are carried out by the text processing acceleration operator, which can improve the speed of text processing.
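A minimal sketch of the two-phase idea, process the text, then handle the results in parallel, is shown below. The operator interface, the whitespace tokenizer, and the thread-pool fan-out are all assumptions; the abstract does not specify what the acceleration operator does internally.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy "text processing acceleration operator": tokenize a batch of texts,
# then process the tokenized content in parallel worker threads.

def tokenize(text):
    """Stand-in text processing step: lowercase and whitespace-split."""
    return text.lower().split()

def accelerate(texts, per_item_fn, max_workers=4):
    """Tokenize each text, then apply per_item_fn to all results in
    parallel."""
    tokenized = [tokenize(t) for t in texts]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(per_item_fn, tokenized))
```

For instance, `accelerate(["Hello World", "A B C"], len)` tokenizes both strings and counts their tokens concurrently.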
-
Publication No.: US11604774B2
Publication Date: 2023-03-14
Application No.: US17480294
Filing Date: 2021-09-21
Inventor: Liujie Zhang , Yamei Li , Huihuang Zheng , Hongyu Liu , Xiang Lan , Dianhai Yu , Yanjun Ma , Tian Wu , Haifeng Wang
Abstract: A method and apparatus of converting a schema in a deep learning framework, an electronic device, and a computer storage medium are provided. The method of converting the schema in the deep learning framework includes: updating a first schema, based on first syntax elements in the first schema and a context relationship between the first syntax elements in the first schema, so as to obtain an updated first schema; generating second syntax elements corresponding to updated first syntax elements in the updated first schema, based on a mapping relationship between the updated first syntax elements in the updated first schema and second syntax elements in a second schema system; and combining the second syntax elements according to a context relationship between the updated first syntax elements, so as to generate a second schema.
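The three steps of this schema conversion, context-sensitive update, element-wise mapping, and recombination in context order, can be sketched on a toy pair of "schemas". The element names, the context rule, and the mapping table below are entirely made up for illustration.

```python
# Sketch of the abstract's three steps on toy syntax elements.

# step 1: update first-schema elements using their context relationship
def update_elements(elements):
    updated = []
    for i, el in enumerate(elements):
        # toy context rule: an 'assign' right after 'declare' becomes 'init'
        if el == "assign" and i > 0 and elements[i - 1] == "declare":
            updated.append("init")
        else:
            updated.append(el)
    return updated

# step 2: mapping between updated first-schema and second-schema elements
MAPPING = {"declare": "let", "init": "let_with_value", "assign": "set",
           "call": "invoke"}

def convert_schema(elements):
    updated = update_elements(elements)
    # step 3: combine mapped elements following the original context order
    return [MAPPING[el] for el in updated]
```

The context-aware update matters: a bare `assign` maps to `set`, but an `assign` that immediately follows `declare` is first rewritten to `init` and therefore converts to a different second-schema element.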