NEURAL NETWORK MODEL OPTIMIZATION METHOD AND RELATED DEVICE

    Publication number: US20240249115A1

    Publication date: 2024-07-25

    Application number: US18605951

    Filing date: 2024-03-15

    CPC classification number: G06N3/045 G06N3/084

    Abstract: An input of an optimized query (Query) feature transformation module is obtained based on an output feature of at least one previous network layer of the optimized attention layer. An input of an optimized key (Key) feature transformation module is obtained based on an output feature of at least one previous network layer of the optimized attention layer. An input of an optimized value (Value) feature transformation module is obtained based on an output feature of at least one previous network layer of the optimized attention layer. The input of at least one of the optimized Query, Key, and Value feature transformation modules is obtained based on an output feature of at least one non-adjacent previous network layer of the optimized attention layer.
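
The wiring described in this abstract can be illustrated with a minimal sketch. It assumes single-head scaled dot-product attention and identity projection matrices, neither of which is stated in the abstract. The names `attention` and `layer_outputs`, and the choice of feeding only the Key module from the non-adjacent layer, are hypothetical and chosen for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q_in, k_in, v_in, w_q, w_k, w_v):
    """Scaled dot-product attention whose Q, K, V inputs may differ."""
    q = matmul(q_in, w_q)   # Query feature transformation
    k = matmul(k_in, w_k)   # Key feature transformation
    v = matmul(v_in, w_v)   # Value feature transformation
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)

# Hypothetical outputs of the two layers that precede the attention layer
# (treated here as layer 3): layer 2 is adjacent, layer 1 is non-adjacent.
layer_outputs = {
    1: [[1.0, 0.0], [0.0, 1.0]],
    2: [[0.5, 0.5], [1.0, -1.0]],
}
identity = [[1.0, 0.0], [0.0, 1.0]]  # assumed projection weights

out = attention(
    q_in=layer_outputs[2],  # adjacent previous layer
    k_in=layer_outputs[1],  # non-adjacent previous layer
    v_in=layer_outputs[2],  # adjacent previous layer
    w_q=identity, w_k=identity, w_v=identity,
)
```

In a conventional attention layer all three inputs would be `layer_outputs[2]`; the only change in the sketch is that one of them is drawn from an earlier, non-adjacent layer.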

    CLASSIFICATION MODEL TRAINING METHOD, HYPERPARAMETER SEARCH METHOD, AND APPARATUS

    Publication number: US20230186103A1

    Publication date: 2023-06-15

    Application number: US18165083

    Filing date: 2023-02-06

    CPC classification number: G06N3/0985 G06N3/084

    Abstract: This application relates to the field of artificial intelligence technologies, and describes a classification model training method, a hyperparameter search method, and an apparatus. The training method includes obtaining a target hyperparameter of a to-be-trained classification model. The target hyperparameter is used to control a gradient update operation of the to-be-trained classification model. The to-be-trained classification model includes a scaling invariance linear layer. The scaling invariance linear layer keeps the predicted classification result unchanged when a weight parameter of the to-be-trained classification model is multiplied by any scaling coefficient. The method further includes updating the weight parameter of the to-be-trained classification model based on the target hyperparameter and a target training manner, to obtain a trained classification model.
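
One way to obtain the scaling invariance property is to divide the layer output by the norm of the weight matrix. The sketch below assumes that normalization and a positive scaling coefficient; the abstract does not say how the layer is built, and the function names are hypothetical.

```python
import math

def scale_invariant_linear(weights, x):
    """Linear layer whose output is normalized by the weight norm.

    Assumes the weight matrix is not all zeros.
    """
    norm = math.sqrt(sum(w * w for row in weights for w in row))
    return [sum(w * xi for w, xi in zip(row, x)) / norm for row in weights]

def predict(weights, x):
    """Return the index of the largest logit."""
    logits = scale_invariant_linear(weights, x)
    return max(range(len(logits)), key=logits.__getitem__)

weights = [[1.0, 2.0], [3.0, -1.0]]
scaled = [[10.0 * w for w in row] for row in weights]  # weights times 10
x = [0.5, 1.0]

logits_a = scale_invariant_linear(weights, x)
logits_b = scale_invariant_linear(scaled, x)
```

Because the numerator and the norm both grow by the same factor, `logits_a` and `logits_b` agree up to floating-point error, so the predicted class is the same for both weight matrices.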

    NEURAL NETWORK MODEL TRAINING METHOD, DATA PROCESSING METHOD, AND APPARATUS

    Publication number: US20240078428A1

    Publication date: 2024-03-07

    Application number: US18354744

    Filing date: 2023-07-19

    CPC classification number: G06N3/08 G06N3/048

    Abstract: A neural network model training method, a data processing method, and an apparatus are disclosed. The neural network model training method includes: training a neural network model based on training data, where an activation function of the neural network model includes at least one piecewise function, and the piecewise function includes a plurality of trainable parameters; and updating the plurality of trainable parameters of the at least one piecewise function in a training process. According to the method, the activation function suitable for the neural network model can be obtained. This can improve performance of the neural network model.
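
A piecewise function with trainable parameters can be sketched as a piecewise linear activation. The parameterization below (two boundaries, heights at evenly spaced breakpoints, and two outer slopes) is an assumption made for illustration, as is the class name `PiecewiseLinear`. The update rule is plain gradient descent on a squared error and touches only the breakpoint heights.

```python
class PiecewiseLinear:
    """Piecewise linear activation with trainable breakpoint heights."""

    def __init__(self, left, right, heights, left_slope, right_slope):
        self.left, self.right = left, right
        self.heights = list(heights)    # values at evenly spaced breakpoints
        self.left_slope = left_slope    # slope for x < left
        self.right_slope = right_slope  # slope for x > right

    def _locate(self, x):
        """Return the segment index and the position inside it, in [0, 1]."""
        n = len(self.heights) - 1
        step = (self.right - self.left) / n
        pos = (x - self.left) / step
        i = min(int(pos), n - 1)
        return i, pos - i

    def __call__(self, x):
        if x < self.left:
            return self.heights[0] + self.left_slope * (x - self.left)
        if x > self.right:
            return self.heights[-1] + self.right_slope * (x - self.right)
        i, t = self._locate(x)
        return (1 - t) * self.heights[i] + t * self.heights[i + 1]

    def sgd_step(self, x, target, lr=0.1):
        """One gradient step on (f(x) - target)^2 for x inside [left, right]."""
        if not (self.left <= x <= self.right):
            return
        i, t = self._locate(x)
        err = self(x) - target
        self.heights[i] -= lr * 2 * err * (1 - t)
        self.heights[i + 1] -= lr * 2 * err * t

# Initialize to match ReLU on breakpoints -2, -1, 0, 1, 2.
act = PiecewiseLinear(-2.0, 2.0, [0.0, 0.0, 0.0, 1.0, 2.0], 0.0, 1.0)
before = act(0.5)
act.sgd_step(0.5, target=1.0)
after = act(0.5)
```

Starting from a ReLU-like shape and letting training move the heights is one plausible design choice; the abstract itself only requires that the piecewise function's parameters be updated during training.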

    MODEL TRAINING METHOD AND APPARATUS

    Publication number: US20230385642A1

    Publication date: 2023-11-30

    Application number: US18446294

    Filing date: 2023-08-08

    CPC classification number: G06N3/08 G06N3/0464

    Abstract: This application discloses a model training method, which may be applied to the field of artificial intelligence. The method includes: obtaining a first neural network model; replacing a first convolutional layer in the first neural network model with a linear operation to obtain a plurality of second neural network models; and performing model training on the plurality of second neural network models, to obtain the neural network model with the highest precision among the trained second neural network models. In this application, a convolutional layer in a to-be-trained neural network is replaced with a linear operation equivalent to the convolutional layer. The replacement manner with the highest precision is selected from a plurality of replacement manners, to improve precision of the trained model.
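
The abstract does not say which linear operations replace the convolutional layer. As one hedged illustration, the sketch below uses a sum of two parallel 1-D convolutions, which by linearity equals a single convolution with the summed kernel. The helper `select_best` and its `evaluate` callback are hypothetical stand-ins for training each candidate and measuring its precision.

```python
import math

def conv1d(signal, kernel):
    """Valid-mode 1-D cross-correlation, as used in convolutional layers."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def two_branch(signal, kernel_a, kernel_b):
    """Replacement operation: two parallel convolutions, summed."""
    out_a = conv1d(signal, kernel_a)
    out_b = conv1d(signal, kernel_b)
    return [a + b for a, b in zip(out_a, out_b)]

def select_best(candidates, evaluate):
    """Pick the candidate with the highest score from `evaluate`."""
    return max(candidates, key=evaluate)

signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel_a = [1.0, 0.0, -1.0]
kernel_b = [0.5, 0.5, 0.5]
merged = [a + b for a, b in zip(kernel_a, kernel_b)]

branch_out = two_branch(signal, kernel_a, kernel_b)
merged_out = conv1d(signal, merged)  # same result with a single kernel
```

Since `branch_out` and `merged_out` match, the two-branch form can be trained with extra parameters and later folded back into one convolution. Different branch layouts would give the plurality of candidate models from which `select_best` chooses.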
