NEURAL NETWORK MODEL OPTIMIZATION METHOD AND RELATED DEVICE

    Publication number: US20240249115A1

    Publication date: 2024-07-25

    Application number: US18605951

    Application date: 2024-03-15

    CPC classification number: G06N3/045 G06N3/084

    Abstract: In an optimized attention layer, the input of each of the optimized query (Query), key (Key), and value (Value) feature transformation modules is obtained based on an output feature of at least one previous network layer of the optimized attention layer. The input of at least one of these three feature transformation modules is obtained based on an output feature of at least one non-adjacent previous network layer of the optimized attention layer.
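
    The wiring described in this abstract can be illustrated with a minimal sketch. It assumes single-head scaled dot-product attention and plain linear feature transformations; the abstract specifies neither. The names `project`, `optimized_attention`, and `layer_outputs`, and the choice of which layer feeds which module, are hypothetical.

```python
import math

def project(tokens, weight):
    """Apply a linear feature transformation: each token vector times weight."""
    return [[sum(t * w for t, w in zip(tok, col)) for col in zip(*weight)]
            for tok in tokens]

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def optimized_attention(q_in, k_in, v_in, wq, wk, wv):
    """Scaled dot-product attention whose Q, K, V inputs may come from different layers."""
    q, k, v = project(q_in, wq), project(k_in, wk), project(v_in, wv)
    scale = math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        attn = softmax([sum(a * b for a, b in zip(q_row, k_row)) / scale
                        for k_row in k])
        out.append([sum(a * v_row[d] for a, v_row in zip(attn, v))
                    for d in range(len(v[0]))])
    return out

# Hypothetical outputs of three earlier layers (2 tokens, 2 features each).
layer_outputs = [
    [[1.0, 0.0], [0.0, 1.0]],  # layer 0: non-adjacent to the attention layer
    [[0.5, 0.5], [0.2, 0.8]],  # layer 1
    [[2.0, 2.0], [2.0, 2.0]],  # layer 2: the adjacent previous layer
]
identity = [[1.0, 0.0], [0.0, 1.0]]

# Assumed wiring: Query and Value read the adjacent layer, Key reads layer 0.
result = optimized_attention(
    q_in=layer_outputs[2], k_in=layer_outputs[0], v_in=layer_outputs[2],
    wq=identity, wk=identity, wv=identity)
```

    In this sketch only the Key module reads a non-adjacent layer, which is one way to satisfy the "at least one" condition; any other assignment of earlier layers to the three modules would follow the same pattern.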

    CLASSIFICATION MODEL TRAINING METHOD, HYPERPARAMETER SEARCH METHOD, AND APPARATUS

    Publication number: US20230186103A1

    Publication date: 2023-06-15

    Application number: US18165083

    Application date: 2023-02-06

    CPC classification number: G06N3/0985 G06N3/084

    Abstract: This application relates to the field of artificial intelligence technologies, and describes a classification model training method, a hyperparameter search method, and an apparatus. The training method includes obtaining a target hyperparameter of a to-be-trained classification model. The target hyperparameter is used to control a gradient update operation of the to-be-trained classification model. The to-be-trained classification model includes a scaling invariance linear layer. The scaling invariance linear layer keeps the predicted classification result unchanged when a weight parameter of the to-be-trained classification model is multiplied by any scaling coefficient. The method further includes updating the weight parameter of the to-be-trained classification model based on the target hyperparameter and a target training manner, to obtain a trained classification model.
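
    One way to obtain the scaling invariance property is to normalize the weights before use. The sketch below assumes each weight row is divided by its L2 norm and that the scaling coefficient is positive; the abstract does not say how the layer is built, so `scale_invariant_linear` and `predict` are hypothetical names for illustration only.

```python
import math

def scale_invariant_linear(weights, x, eps=1e-12):
    """Compute logits with every weight row normalized to unit L2 norm.

    Assumption: normalization is the source of scale invariance here.
    """
    logits = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row)) + eps
        logits.append(sum((w / norm) * xi for w, xi in zip(row, x)))
    return logits

def predict(weights, x):
    """Return the index of the largest logit as the predicted class."""
    logits = scale_invariant_linear(weights, x)
    return max(range(len(logits)), key=logits.__getitem__)

weights = [[1.0, 2.0], [3.0, -1.0]]
x = [0.5, 1.5]
scaled_weights = [[7.5 * w for w in row] for row in weights]

# The prediction is the same before and after scaling the weights.
same_prediction = predict(weights, x) == predict(scaled_weights, x)
```

    Because the row norm scales together with the weights, the normalized weights (and so the logits) are unchanged up to floating-point error for any positive coefficient.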

    NEURAL NETWORK STRUCTURE DETERMINING METHOD AND APPARATUS

    Publication number: US20230289572A1

    Publication date: 2023-09-14

    Application number: US18316369

    Application date: 2023-05-12

    CPC classification number: G06N3/0464 G06N3/084

    Abstract: A neural network structure determining method is disclosed. The method includes: obtaining a to-be-trained initial neural network, where the initial neural network includes M first blocks and a second block, the second block is connected to each first block, and each first block corresponds to one trainable target weight; performing model training on the initial neural network, to obtain M updated target weights; and updating a connection relationship between the second block and the M first blocks in the initial neural network based on the M updated target weights, to obtain a first neural network.
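
    The connection update step can be sketched as follows, assuming the rule is "keep the connections with the largest updated target weights". The abstract does not state the actual selection rule, so `select_connections` and the parameter `keep` are hypothetical.

```python
def select_connections(target_weights, keep):
    """Return the indices of the first blocks that stay connected to the second block.

    Assumption: the `keep` blocks with the largest trained weights are retained.
    """
    ranked = sorted(range(len(target_weights)),
                    key=lambda i: target_weights[i], reverse=True)
    return sorted(ranked[:keep])

# Hypothetical target weights for M = 4 first blocks after training.
updated_weights = [0.10, 0.70, 0.05, 0.40]
kept = select_connections(updated_weights, keep=2)
```

    Here `kept` lists the first blocks that remain connected to the second block in the resulting first neural network.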

    DATA PROCESSING METHOD, SYSTEM, AND APPARATUS

    Publication number: US20230222639A1

    Publication date: 2023-07-13

    Application number: US18182655

    Application date: 2023-03-13

    Abstract: This application provides a data processing method, system, and apparatus, and relates to the field of artificial intelligence (AI). The data processing method may be performed by a server, or may be performed by a device having a data processing function. During execution, reference data is first obtained. The reference data includes RGB image data and a device parameter of an image device. Then, a plurality of conversion parameters required for converting the RGB image data into RAW data are determined. Finally, the RGB image data is processed into the RAW data based on the plurality of conversion parameters. The RAW data matches the device parameter of the image device. In this application, the RGB image data is converted into the RAW data based on the plurality of conversion parameters rather than manual experience. Therefore, the described data processing method, system, and apparatus improve data processing efficiency.

    NEURAL NETWORK CONSTRUCTION METHOD AND APPARATUS, AND IMAGE PROCESSING METHOD AND APPARATUS

    Publication number: US20220222934A1

    Publication date: 2022-07-14

    Application number: US17700098

    Application date: 2022-03-21

    Abstract: This application discloses a neural network construction method and apparatus, and an image processing method and apparatus in the field of artificial intelligence. The neural network construction method includes: constructing a search space based on an application requirement of a target neural network, where the search space includes M elements, the M elements indicate M network structures, each of the M elements includes a quantity of blocks in each stage of the corresponding network structure and a channel quantity of each block, and M is a positive integer (S710); and selecting a target network structure from the M network structures based on a distribution relationship among unevaluated elements in the search space (S720). According to the method, a neural network satisfying a performance requirement can be efficiently constructed.

    NEURAL NETWORK MODEL TRAINING METHOD, DATA PROCESSING METHOD, AND APPARATUS

    Publication number: US20240078428A1

    Publication date: 2024-03-07

    Application number: US18354744

    Application date: 2023-07-19

    CPC classification number: G06N3/08 G06N3/048

    Abstract: A neural network model training method, a data processing method, and an apparatus are disclosed. The neural network model training method includes: training a neural network model based on training data, where an activation function of the neural network model includes at least one piecewise function, and the piecewise function includes a plurality of trainable parameters; and updating the plurality of trainable parameters of the at least one piecewise function in a training process. According to the method, the activation function suitable for the neural network model can be obtained. This can improve performance of the neural network model.
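
    A trainable piecewise activation can be sketched as a piecewise linear function whose values at evenly spaced breakpoints are the trainable parameters. This parameterization, the class `PiecewiseLinearActivation`, and the squared-error update in `sgd_step` are assumptions for illustration; the abstract only says that the piecewise function has a plurality of trainable parameters.

```python
class PiecewiseLinearActivation:
    """Piecewise linear function on [left, right] with trainable breakpoint values."""

    def __init__(self, left, right, values):
        self.left, self.right = left, right
        self.values = list(values)  # trainable: value at each breakpoint

    def _locate(self, x):
        """Return the segment index and the relative position of x inside it."""
        segments = len(self.values) - 1
        width = (self.right - self.left) / segments
        idx = int((x - self.left) // width)
        idx = min(max(idx, 0), segments - 1)  # edge segments extend outward
        t = (x - (self.left + idx * width)) / width
        return idx, t

    def forward(self, x):
        idx, t = self._locate(x)
        return (1 - t) * self.values[idx] + t * self.values[idx + 1]

    def grad_values(self, x):
        """Gradient of the output with respect to each breakpoint value."""
        idx, t = self._locate(x)
        grad = [0.0] * len(self.values)
        grad[idx], grad[idx + 1] = 1 - t, t
        return grad

def sgd_step(act, x, target, lr):
    """One gradient step on the squared error (act(x) - target) ** 2."""
    err = act.forward(x) - target
    for i, g in enumerate(act.grad_values(x)):
        act.values[i] -= lr * 2 * err * g

# Initialized to match ReLU at the breakpoints -2, -1, 0, 1, 2.
act = PiecewiseLinearActivation(-2.0, 2.0, [0.0, 0.0, 0.0, 1.0, 2.0])
before = act.forward(1.5)
sgd_step(act, x=1.5, target=1.0, lr=0.1)
after = act.forward(1.5)
```

    Starting from a known activation such as ReLU and letting training move the breakpoint values is one plausible reading of how an activation suited to the model could be obtained.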

    MODEL TRAINING METHOD AND APPARATUS

    Publication number: US20230385642A1

    Publication date: 2023-11-30

    Application number: US18446294

    Application date: 2023-08-08

    CPC classification number: G06N3/08 G06N3/0464

    Abstract: This application discloses a model training method, which may be applied to the field of artificial intelligence. The method includes: obtaining a first neural network model; replacing a first convolutional layer in the first neural network model with a linear operation to obtain a plurality of second neural network models; and performing model training on the plurality of second neural network models, to obtain the neural network model with the highest model precision among the plurality of trained second neural network models. In this application, a convolutional layer in a to-be-trained neural network is replaced with a linear operation equivalent to a convolutional layer. The replacement manner with the highest precision is selected from a plurality of replacement manners, to improve precision of a trained model.
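
    The idea of a linear operation that is equivalent to a convolutional layer can be illustrated in one dimension. The sketch assumes the linear operation is a sum of parallel convolution branches, which can be merged back into a single kernel; the abstract does not name the specific linear operations, and `conv1d`, `branched_op`, and `merge_kernels` are hypothetical helpers.

```python
def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation of signal with kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def branched_op(signal, kernels):
    """Linear operation built from parallel branches whose outputs are summed."""
    outputs = [conv1d(signal, kernel) for kernel in kernels]
    return [sum(vals) for vals in zip(*outputs)]

def merge_kernels(kernels):
    """Fold the parallel branches into one equivalent convolution kernel."""
    return [sum(ws) for ws in zip(*kernels)]

signal = [1, 2, 3, 4, 5]
# Three branches of equal kernel size; the last one acts as an identity branch.
branches = [[1, 0, -1], [2, 1, 0], [0, 1, 0]]

branched_output = branched_op(signal, branches)
merged_output = conv1d(signal, merge_kernels(branches))
```

    Because convolution is linear in its kernel, `branched_output` and `merged_output` are identical, so a model trained with the branched form can be collapsed to a single convolution afterwards.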

    NEURAL NETWORK MODEL UPDATE METHOD, IMAGE PROCESSING METHOD, AND APPARATUS

    Publication number: US20220319154A1

    Publication date: 2022-10-06

    Application number: US17843310

    Application date: 2022-06-17

    Abstract: This application discloses a neural network model update method, an image processing method, and an apparatus in the field of artificial intelligence. The neural network model update method includes: obtaining a structure of a neural network model and a related parameter of the neural network model; training the neural network model based on the related parameter of the neural network model to obtain a trained neural network model; and if an evaluation result of the trained neural network model does not meet a preset condition, updating at least two items of the related parameter of the neural network model and the structure of the neural network model until an evaluation result of an updated neural network model meets a preset condition and/or a quantity of updates reaches a preset quantity of times. According to the method in this application, efficiency of updating a neural network model can be improved.
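
    The update procedure in this abstract is a loop with two stopping conditions. The sketch below assumes the evaluation result is a single score compared against a threshold, and it takes the training, evaluation, and update steps as caller-supplied functions; `update_until_ok` and all of its parameters are hypothetical names.

```python
def update_until_ok(structure, params, train, evaluate, propose,
                    target_score, max_updates):
    """Train, evaluate, and update until the score is met or updates run out."""
    for updates in range(max_updates + 1):
        model = train(structure, params)
        score = evaluate(model)
        if score >= target_score or updates == max_updates:
            return model, structure, params, updates
        # Update the structure and the related parameters together.
        structure, params = propose(structure, params, score)

# Toy stand-ins: the "structure" is a depth, the "model" is just its settings.
def toy_train(depth, params):
    return {"depth": depth, "lr": params["lr"]}

def toy_evaluate(model):
    return model["depth"] * 10

def toy_propose(depth, params, score):
    return depth + 1, {"lr": params["lr"] * 0.5}

model, depth, params, updates = update_until_ok(
    1, {"lr": 0.1}, toy_train, toy_evaluate, toy_propose,
    target_score=30, max_updates=5)
```

    In the toy run, each update changes both the depth and the learning rate, which mirrors the requirement that at least two items be updated per iteration.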

    NEURAL NETWORK TRAINING METHOD, DATA PROCESSING METHOD, AND RELATED APPARATUS

    Publication number: US20220215259A1

    Publication date: 2022-07-07

    Application number: US17701101

    Application date: 2022-03-22

    Abstract: Technical solutions in this application are applied to the field of artificial intelligence. This application provides a neural network training method, a method for performing data processing by using a neural network trained by using the method, and a related apparatus. According to the training method in this application, a target neural network is trained in an adversarial manner, so that a policy search module can continuously discover a weakness of the target neural network, generate a policy of higher quality according to the weakness, and perform data augmentation according to the policy to obtain data of higher quality. A target neural network of higher quality can be trained according to the data. In the data processing method in this application, data processing is performed by using the foregoing target neural network, so that a more accurate processing result can be obtained.
