-
Publication No.: US20240070436A1
Publication date: 2024-02-29
Application No.: US17900592
Application date: 2022-08-31
Applicant: Huawei Technologies Co., Ltd.
Inventor: Hang XU, Lu HOU, Guansong LU, Minzhe NIU, Zhenguo LI, Runhui HUANG, Lewei YAO, Chunjing XU, Xiaodan LIANG
IPC: G06N3/04, G06F40/284
CPC classification number: G06N3/0454, G06F40/284
Abstract: A method is provided for data processing performed by a processing system. The method comprises determining a set of first tokens for first data and a set of second tokens for second data, each token comprising information associated with a segment of the respective data, determining pair-wise similarities between the set of first tokens and the set of second tokens, each pair comprising a first token in the set of first tokens and a second token in the set of second tokens, determining, for each first token in the set of first tokens, a maximum similarity based on the determined pair-wise similarities between the respective first token and the second tokens in the set of second tokens, and determining a first similarity between the first data and the second data by aggregating the maximum similarities corresponding to the first tokens in the set of first tokens.
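The token-wise matching described in this abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure or aggregation is used, so cosine similarity and a plain mean are assumptions here, and the function names and toy token vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two token vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def tokenwise_similarity(first_tokens, second_tokens):
    """For each first token, keep its maximum similarity over all second
    tokens, then aggregate the maxima (assumption: arithmetic mean)."""
    max_sims = []
    for t1 in first_tokens:
        pairwise = [cosine(t1, t2) for t2 in second_tokens]
        max_sims.append(max(pairwise))
    return sum(max_sims) / len(max_sims)

# Hypothetical 2-dimensional token embeddings for the two inputs.
first_tokens = [[1.0, 0.0], [0.0, 1.0]]
second_tokens = [[1.0, 0.0], [1.0, 1.0]]
score = tokenwise_similarity(first_tokens, second_tokens)
```

Because the maximum is taken per first token, the score is not symmetric in general; swapping the two token sets gives the similarity in the other direction.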
-
Publication No.: US20240119268A1
Publication date: 2024-04-11
Application No.: US18524523
Application date: 2023-11-30
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Lu HOU, Lifeng SHANG, Xin JIANG, Li QIAN
IPC: G06N3/048
CPC classification number: G06N3/048
Abstract: This disclosure relates to the field of artificial intelligence, and discloses a data processing method. The method includes: obtaining a transformer model including a target network layer and a target module; and processing to-be-processed data by using the transformer model, to obtain a data processing result. The target module is configured to: perform a target operation on a feature map output at the target network layer, to obtain an operation result, and fuse the operation result and the feature map output, to obtain an updated feature map output. In this disclosure, the target module is inserted into the transformer model, and the operation result generated by the target module and an input are fused, so that information carried in a feature map output by the target network layer of the transformer model is increased.
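As a rough illustration of the inserted target module, the sketch below applies an operation to a layer's feature map and fuses the result back into that feature map. The abstract names neither the target operation nor the fusion rule, so the element-wise tanh and the element-wise addition used here are assumptions, and all function names are hypothetical.

```python
import math

def target_operation(feature_map):
    """Hypothetical target operation: element-wise tanh of the feature map."""
    return [[math.tanh(x) for x in row] for row in feature_map]

def fuse(feature_map, operation_result):
    """Assumed fusion rule: element-wise addition of the two maps."""
    return [
        [a + b for a, b in zip(row_f, row_o)]
        for row_f, row_o in zip(feature_map, operation_result)
    ]

def target_module(feature_map):
    """Apply the operation to the layer output, then fuse it back in."""
    return fuse(feature_map, target_operation(feature_map))

# Toy feature map standing in for the output of the target network layer.
layer_output = [[0.0, 1.0], [-1.0, 2.0]]
updated = target_module(layer_output)
```

The updated feature map keeps the shape of the original, so it can be passed to the next layer of the model unchanged.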
-
Publication No.: US20240104346A1
Publication date: 2024-03-28
Application No.: US17945978
Application date: 2022-09-15
Applicant: Huawei Technologies Co., Ltd.
Inventor: Lu HOU, Chaofan TAO, Wei ZHANG, Lifeng SHANG, Xin JIANG, Qun LIU, Li QIAN
IPC: G06N3/04
CPC classification number: G06N3/0454
Abstract: A method is provided for quantizing a neural network model performed by a processing system. The method comprises determining a scaling factor based on a distribution of weights associated with the neural network model, determining quantized weights based on the scaling factor and the weights associated with the distribution, determining a training loss of the neural network model based on the quantized weights during training of the neural network model, and determining an updated scaling factor for the neural network model based on a gradient of the training loss.
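The four steps in this abstract (initial scaling factor, quantized weights, training loss, gradient-based update of the scaling factor) can be sketched on a toy model. Everything specific below is an assumption rather than the patented method: the mean-absolute-value initialization, symmetric uniform quantization, the straight-through estimator for rounding, the one-neuron squared-error loss, and all names and numbers.

```python
import math

def init_scale(weights, num_bits):
    """Initial scaling factor from the weight distribution.
    Assumption: 2 * mean(|w|) / sqrt(qmax)."""
    qmax = 2 ** (num_bits - 1) - 1
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    return 2.0 * mean_abs / math.sqrt(qmax)

def quantize(weights, scale, num_bits):
    """Symmetric uniform quantization: w_q = scale * clip(round(w / scale))."""
    qmax = 2 ** (num_bits - 1) - 1
    levels = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return [scale * q for q in levels], levels

def loss_and_grad(quantized, inputs, target):
    """Toy training loss: squared error of a single linear neuron."""
    pred = sum(w * x for w, x in zip(quantized, inputs))
    loss = (pred - target) ** 2
    grad_wq = [2.0 * (pred - target) * x for x in inputs]
    return loss, grad_wq

def scale_gradient(weights, levels, scale, grad_wq, num_bits):
    """dLoss/dscale, treating round() as identity in the backward pass."""
    qmax = 2 ** (num_bits - 1) - 1
    total = 0.0
    for w, q, g in zip(weights, levels, grad_wq):
        ratio = w / scale
        # Unclipped: d(scale * round(ratio))/dscale = q - ratio; clipped: q.
        dwq_dscale = (q - ratio) if -qmax <= ratio <= qmax else q
        total += g * dwq_dscale
    return total

weights = [0.9, -0.4, 0.25]   # hypothetical full-precision weights
inputs = [1.0, 2.0, -1.0]     # hypothetical training input
target, num_bits, lr = 0.0, 4, 0.1

scale = init_scale(weights, num_bits)
w_q, levels = quantize(weights, scale, num_bits)
loss, grad_wq = loss_and_grad(w_q, inputs, target)
new_scale = scale - lr * scale_gradient(weights, levels, scale, grad_wq, num_bits)

new_w_q, _ = quantize(weights, new_scale, num_bits)
new_loss, _ = loss_and_grad(new_w_q, inputs, target)
```

In a real training loop the weights would be updated alongside the scaling factor; only the scaling-factor update is shown here to keep the sketch short.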
-
Publication No.: US20240185086A1
Publication date: 2024-06-06
Application No.: US18443052
Application date: 2024-02-15
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Lu HOU, Haoli Bai, Lifeng Shang, Xin Jiang, Li Qian
Abstract: This disclosure relates to the field of artificial intelligence, and provides model distillation methods and apparatuses. In an implementation, a method includes: obtaining first input data and second input data from a second computing node, wherein the first input data is output data of a third sub-model, and the second input data is output data processed by a fourth sub-model, processing the first input data by using a first sub-model, to obtain a first intermediate output, processing the second input data by using a second sub-model, to obtain a second intermediate output, wherein the first intermediate output and the second intermediate output are used to determine a first gradient, and distilling the first sub-model based on the first gradient, to obtain an updated first sub-model.
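One local distillation step on the first computing node might look like the sketch below. It assumes that the first sub-model is a student block and the second sub-model a frozen teacher block, that both are single linear layers, and that the gradient comes from a mean-squared error between the two intermediate outputs. None of these choices is stated in the abstract, and the two inputs "received from the second computing node" are simply passed in as arguments.

```python
def linear(weights, inputs):
    """Apply a linear sub-model: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def distill_step(student_w, teacher_w, first_input, second_input, lr=0.1):
    """Update the student sub-model from the gap between intermediate outputs."""
    student_out = linear(student_w, first_input)    # first intermediate output
    teacher_out = linear(teacher_w, second_input)   # second intermediate output
    diff = [s - t for s, t in zip(student_out, teacher_out)]
    n = len(diff)
    loss = sum(d * d for d in diff) / n             # assumed MSE distillation loss
    # Gradient of the MSE with respect to each student weight.
    grad = [[2.0 * d * x / n for x in first_input] for d in diff]
    updated_w = [
        [w - lr * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(student_w, grad)
    ]
    return updated_w, loss

# Hypothetical values: upstream outputs of the third and fourth sub-models.
first_input = [1.0, 2.0]
second_input = [0.5, 1.5]
student_w = [[0.0, 0.0]]
teacher_w = [[1.0, -1.0]]

updated_w, loss_before = distill_step(student_w, teacher_w, first_input, second_input)
_, loss_after = distill_step(updated_w, teacher_w, first_input, second_input)
```

Because the gradient depends only on the two received inputs and the local sub-models, the step needs no further communication with the second node under these assumptions.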
-
Publication No.: US20230229912A1
Publication date: 2023-07-20
Application No.: US18123768
Application date: 2023-03-20
Applicant: Huawei Technologies Co., Ltd.
Inventor: Wei ZHANG, Lu HOU, Yichun YIN, Lifeng SHANG
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: A model compression method is provided, which can be applied to the field of artificial intelligence. The method includes: obtaining a first neural network model, a second neural network model, and a third neural network model; processing first to-be-processed data using the first neural network model, to obtain a first output; processing the first to-be-processed data using the third neural network model, to obtain a second output; determining a first target loss based on the first output and the second output, and updating the second neural network model based on the first target loss, to obtain an updated second neural network model; and compressing the updated second neural network model to obtain a target neural network model. The model generated based on the method has higher processing precision.
-
Publication No.: US20220383078A1
Publication date: 2022-12-01
Application No.: US17882895
Application date: 2022-08-08
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Lu HOU, Lifeng SHANG, Xin JIANG
Abstract: In a data processing method, a processing device obtains a first neural network model and an available resource state of a terminal device, and determines a second neural network model based on the first neural network model and the available resource state. An appropriate model size is determined based on the available resource state, and a part of the first neural network model is selected, based on the determined model size, as the second neural network model on which data processing is to be performed.
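The selection of a sub-model from the available resource state can be sketched as follows. The abstract does not define the resource state or the selection rule, so this sketch assumes the state is free memory in MB, that the memory footprint scales linearly with a width ratio (a crude estimate), and that the sub-model keeps the leading units of each layer. All names and numbers are hypothetical.

```python
def determine_width_ratio(full_memory_mb, available_memory_mb,
                          width_ratios=(1.0, 0.75, 0.5, 0.25)):
    """Pick the largest width ratio whose estimated footprint fits.
    Assumption: footprint = full_memory_mb * ratio."""
    for ratio in width_ratios:  # ordered from largest to smallest
        if full_memory_mb * ratio <= available_memory_mb:
            return ratio
    return width_ratios[-1]     # fall back to the smallest configuration

def select_sub_model(layers, width_ratio):
    """Keep the leading fraction of units in every layer of the first model."""
    sub_model = []
    for layer in layers:
        keep = max(1, int(len(layer) * width_ratio))
        sub_model.append(layer[:keep])
    return sub_model

# Toy first model: two layers with four units each (values stand in for weights).
first_model = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]

ratio = determine_width_ratio(full_memory_mb=400.0, available_memory_mb=250.0)
second_model = select_sub_model(first_model, ratio)
```

Keeping a prefix of each layer only makes sense if the first model was trained so that its leading units carry the most important information, which is a further assumption of this sketch.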