Patent search ap:("Huawei Technologies Co. Ltd.") AND inv:"Yucong ZHOU"

1.

Invention publication
NEURAL NETWORK MODEL OPTIMIZATION METHOD AND RELATED DEVICE (pending, published)

Publication No.: US20240249115A1

Publication date: 2024-07-25

Application No.: US18605951

Filing date: 2024-03-15

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Yunxiao SUN, Yucong ZHOU, Zhao ZHONG

IPC: G06N3/045, G06N3/084

CPC classification number: G06N3/045, G06N3/084

Abstract: An input of an optimized query Query feature transformation module is obtained based on an output feature of at least one previous network layer of the optimized attention layer. An input of an optimized key Key feature transformation module is obtained based on an output feature of at least one previous network layer of the optimized attention layer. An input of an optimized value Value feature transformation module is obtained based on an output feature of at least one previous network layer of the optimized attention layer. An input of at least one feature transformation module in the optimized query Query feature transformation module, the optimized key Key feature transformation module, and the optimized value Value feature transformation module is obtained based on an output feature of at least one non-adjacent previous network layer of the optimized attention layer.
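The wiring described in this abstract can be illustrated with a minimal sketch. It assumes plain scaled dot-product attention and identity projections, neither of which is stated in the abstract. The helper names (`matmul`, `softmax`, `attention`) and the layer outputs `h0`, `h1`, `h2` are hypothetical. The point is only that the Query, Key, and Value inputs may come from different earlier layers, at least one of them non-adjacent.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q_in, k_in, v_in, w_q, w_k, w_v):
    """Scaled dot-product attention whose Q, K, V inputs may differ."""
    q, k, v = matmul(q_in, w_q), matmul(k_in, w_k), matmul(v_in, w_v)
    k_t = [list(col) for col in zip(*k)]
    scale = math.sqrt(len(q[0]))
    weights = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]
    return matmul(weights, v)

# Hypothetical outputs of three earlier layers (2 tokens, 2 features each).
h0 = [[1.0, 0.0], [0.0, 1.0]]
h1 = [[0.5, 0.5], [0.2, 0.8]]
h2 = [[0.9, 0.1], [0.3, 0.7]]  # the layer directly before the attention layer

identity = [[1.0, 0.0], [0.0, 1.0]]  # assumed projections, for illustration only

# Query from the adjacent layer; Key and Value from non-adjacent layers.
out = attention(h2, h1, h0, identity, identity, identity)
```

Because `h0` is the identity matrix here, each row of `out` is simply the attention weight vector for that token, which makes the routing easy to inspect.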

2.

Invention publication
CLASSIFICATION MODEL TRAINING METHOD, HYPERPARAMETER SEARCH METHOD, AND APPARATUS (pending, published)

Publication No.: US20230186103A1

Publication date: 2023-06-15

Application No.: US18165083

Filing date: 2023-02-06

Applicant: Huawei Technologies Co., Ltd.

Inventor: Yucong ZHOU, Zhao ZHONG

IPC: G06N3/0985, G06N3/084

CPC classification number: G06N3/0985, G06N3/084

Abstract: This application relates to the field of artificial intelligence technologies, and describes a classification model training method, a hyperparameter search method, and an apparatus. The training method includes obtaining a target hyperparameter of a to-be-trained classification model. The target hyperparameter is used to control a gradient update operation of the to-be-trained classification model. The to-be-trained classification model includes a scaling invariance linear layer. The scaling invariance linear layer enables a predicted classification result output when a weight parameter of the to-be-trained classification model is multiplied by any scaling coefficient to remain unchanged. The method further includes updating the weight parameter of the to-be-trained classification model based on the target hyperparameter and a target training manner, to obtain a trained classification model.
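The scaling invariance property can be sketched as follows. The abstract does not give the layer's formula, so this example assumes one common construction: dividing the linear output by the norm of the weight matrix. Under that assumption the output is unchanged for any positive scaling coefficient. The function names are hypothetical.

```python
import math

def scale_invariant_linear(weights, x):
    """Linear layer whose output is divided by the Frobenius norm of the weights.

    Assumed construction: multiplying `weights` by c > 0 multiplies both the
    numerator and the norm by c, so the logits stay the same.
    """
    norm = math.sqrt(sum(w * w for row in weights for w in row))
    return [sum(w * xi for w, xi in zip(row, x)) / norm for row in weights]

def predict(weights, x):
    """Index of the largest logit, i.e. the predicted class."""
    logits = scale_invariant_linear(weights, x)
    return max(range(len(logits)), key=logits.__getitem__)

weights = [[1.0, 2.0], [3.0, -1.0]]
x = [0.5, 2.0]

scaled = [[7.5 * w for w in row] for row in weights]  # same weights, rescaled
base_logits = scale_invariant_linear(weights, x)
scaled_logits = scale_invariant_linear(scaled, x)
```

With such a layer the weight norm no longer affects predictions, which is one reason a hyperparameter controlling the gradient update can be studied in isolation.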

3.

Invention publication
NEURAL NETWORK MODEL TRAINING METHOD, DATA PROCESSING METHOD, AND APPARATUS (pending, published)

Publication No.: US20240078428A1

Publication date: 2024-03-07

Application No.: US18354744

Filing date: 2023-07-19

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Yucong ZHOU, Zezhou ZHU, Zhao ZHONG

IPC: G06N3/08, G06N3/048

CPC classification number: G06N3/08, G06N3/048

Abstract: A neural network model training method, a data processing method, and an apparatus are disclosed. The neural network model training method includes: training a neural network model based on training data, where an activation function of the neural network model includes at least one piecewise function, and the piecewise function includes a plurality of trainable parameters; and updating the plurality of trainable parameters of the at least one piecewise function in a training process. According to the method, the activation function suitable for the neural network model can be obtained. This can improve performance of the neural network model.
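A piecewise activation with trainable parameters might look like the sketch below. The abstract does not say how the function is parameterized, so this example assumes a piecewise linear function defined by trainable values at evenly spaced breakpoints plus two outer slopes, initialized to ReLU. The class name, its fields, and the squared-error update are all hypothetical.

```python
class PiecewiseLinear:
    """Piecewise linear activation with trainable breakpoint values (assumed form)."""

    def __init__(self, left, right, n_intervals):
        self.left, self.right, self.n = left, right, n_intervals
        self.step = (right - left) / n_intervals
        # Trainable: function values at the n + 1 breakpoints, initialized to ReLU.
        self.values = [max(0.0, left + i * self.step) for i in range(n_intervals + 1)]
        # Trainable: slopes outside [left, right], also matching ReLU.
        self.left_slope, self.right_slope = 0.0, 1.0

    def _locate(self, x):
        """Return the interval index and the relative position inside it."""
        i = min(int((x - self.left) / self.step), self.n - 1)
        t = (x - (self.left + i * self.step)) / self.step
        return i, t

    def __call__(self, x):
        if x < self.left:
            return self.values[0] + self.left_slope * (x - self.left)
        if x > self.right:
            return self.values[-1] + self.right_slope * (x - self.right)
        i, t = self._locate(x)
        return (1 - t) * self.values[i] + t * self.values[i + 1]

    def sgd_step(self, x, target, lr=0.1):
        """One gradient step on squared error; returns the loss before the update.

        For brevity only the breakpoint values are updated, and only for
        inputs inside [left, right].
        """
        err = self(x) - target
        if self.left <= x <= self.right:
            i, t = self._locate(x)
            self.values[i] -= lr * 2 * err * (1 - t)
            self.values[i + 1] -= lr * 2 * err * t
        return err * err

act = PiecewiseLinear(-2.0, 2.0, 4)
loss_before = act.sgd_step(-0.5, target=-0.1)
loss_after = (act(-0.5) - (-0.1)) ** 2
```

In a real network the same gradients would flow from the model loss rather than from a direct target, so the activation shape is learned together with the weights.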

4.

Invention publication
MODEL TRAINING METHOD AND APPARATUS (pending, published)

Publication No.: US20230385642A1

Publication date: 2023-11-30

Application No.: US18446294

Filing date: 2023-08-08

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Yucong ZHOU, Zhao ZHONG

IPC: G06N3/08, G06N3/0464

CPC classification number: G06N3/08, G06N3/0464

Abstract: This application discloses a model training method, which may be applied to the field of artificial intelligence. The method includes: obtaining a first neural network model; replacing a first convolutional layer in the first neural network model with a linear operation to obtain a plurality of second neural network models; and performing model training on a plurality of second neural network models, to obtain a neural network model with a highest model precision in a plurality of trained second neural network models. In this application, a convolutional layer in a to-be-trained neural network is replaced with a linear operation equivalent to a convolutional layer. A manner with highest precision is selected from a plurality of replacement manners, to improve precision of a trained model.
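The idea of a linear operation that is equivalent to a convolutional layer can be illustrated with a small sketch. The abstract does not specify which linear operations are used, so this example assumes one simple case: two parallel 1-D convolutions whose outputs are added. Because convolution is linear in its kernel, the two branches collapse into a single convolution after training. The function names are hypothetical.

```python
def conv1d(x, kernel):
    """Valid (no padding) 1-D cross-correlation, as used in most deep learning libraries."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k)) for i in range(len(x) - k + 1)]

def two_branch(x, k_a, k_b):
    """Assumed replacement: two parallel convolutions with summed outputs."""
    return [a + b for a, b in zip(conv1d(x, k_a), conv1d(x, k_b))]

def merge(k_a, k_b):
    """Fold the two branch kernels back into one equivalent kernel."""
    return [a + b for a, b in zip(k_a, k_b)]

x = [1.0, 2.0, -1.0, 0.5, 3.0]
k_a = [0.5, -1.0, 2.0]
k_b = [1.0, 0.0, -0.5]

branch_out = two_branch(x, k_a, k_b)
merged_out = conv1d(x, merge(k_a, k_b))
```

The two outputs agree up to floating-point rounding. The abstract additionally describes training several such replacement variants and keeping the one with the highest precision; that selection step is not shown here.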
