NEURAL NETWORK OBTAINING METHOD, DATA PROCESSING METHOD, AND RELATED DEVICE

    Publication number: US20240232575A1

    Publication date: 2024-07-11

    Application number: US18618100

    Application date: 2024-03-27

    CPC classification number: G06N3/04

    Abstract: A neural network obtaining method, a data processing method, and a related device are disclosed. The disclosed methods may be used for automatic neural architecture search in the field of artificial intelligence. An example method includes: obtaining first indication information, where the first indication information indicates a probability and/or a quantity of times that k neural network modules appear in a first neural architecture cell; generating the first neural architecture cell based on the first indication information, and generating a first neural network; obtaining a target score corresponding to the first indication information, where the target score indicates performance of the first neural network; and obtaining second indication information from a plurality of pieces of first indication information based on a plurality of target scores, and obtaining a target neural network corresponding to the second indication information.
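
The loop described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the module names in `MODULES`, the random sampling of the indication information, and `toy_score` (a stand-in for training and evaluating a real network) are all assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical set of k = 4 candidate neural network modules.
MODULES = ["conv3x3", "conv5x5", "max_pool", "skip"]


def sample_indication(k, rng):
    """Sample one piece of 'first indication information': a probability
    for each of the k modules appearing in a cell."""
    weights = [rng.random() + 1e-9 for _ in range(k)]
    total = sum(weights)
    return [w / total for w in weights]


def build_cell(probs, num_nodes, rng):
    """Generate a cell by drawing a module for each node according to probs."""
    return rng.choices(MODULES, weights=probs, k=num_nodes)


def toy_score(cell):
    """Placeholder target score. A real system would build the network
    from the cell, train it, and measure its performance."""
    counts = Counter(cell)
    return counts["conv3x3"] + 0.5 * counts["skip"] - 0.25 * counts["max_pool"]


def search(num_candidates=20, num_nodes=8, seed=0):
    """Score several pieces of indication information and keep the best,
    which plays the role of the 'second indication information'."""
    rng = random.Random(seed)
    scored = []
    for _ in range(num_candidates):
        probs = sample_indication(len(MODULES), rng)
        cell = build_cell(probs, num_nodes, rng)
        scored.append((toy_score(cell), probs, cell))
    return max(scored, key=lambda item: item[0])


best_score, best_probs, best_cell = search()
print(best_score, best_cell)
```

Searching over per-module probabilities or counts, rather than over individual node-by-node choices, is what keeps the search space small in this sketch.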

    LARGE MODEL EMULATION BY KNOWLEDGE DISTILLATION BASED NAS

    Publication number: US20230237337A1

    Publication date: 2023-07-27

    Application number: US18193815

    Application date: 2023-03-31

    CPC classification number: G06N3/082

    Abstract: Described herein is a machine learning mechanism implemented by one or more computers, the mechanism having access to a base neural network and being configured to determine a simplified neural network by iteratively performing the following set of steps: forming sample data by sampling the architecture of a current candidate neural network; selecting, in dependence on the sample data, an architecture for a second candidate neural network; forming a trained candidate neural network by training the second candidate neural network, wherein the training of the second candidate neural network comprises applying feedback to the second candidate neural network in dependence on a comparison of the behaviours of the second candidate neural network and the base neural network; and adopting the trained candidate neural network as the current candidate neural network for a subsequent iteration of the set of steps.
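
The iterative steps in this abstract can be sketched in Python under heavy simplifying assumptions. Here the "base neural network" is a fixed function, each candidate "architecture" is just a polynomial degree, and "sampling the architecture" means proposing a neighbouring degree. All names (`base_network`, `distill`, `search`) are hypothetical; none of this is taken from the patent itself.

```python
import math
import random


def base_network(x):
    """Stand-in for the large base network (an assumption for illustration)."""
    return math.sin(2.0 * x)


def predict(coeffs, x):
    """Evaluate a polynomial 'student' with the given coefficients."""
    return sum(c * x**i for i, c in enumerate(coeffs))


def distill(degree, xs, steps=500, lr=0.05):
    """Train a student of the given degree by gradient descent on the
    squared difference between its outputs and the base network's."""
    coeffs = [0.0] * (degree + 1)
    targets = [base_network(x) for x in xs]
    for _ in range(steps):
        grads = [0.0] * len(coeffs)
        for x, t in zip(xs, targets):
            err = predict(coeffs, x) - t  # feedback from comparing behaviours
            for i in range(len(coeffs)):
                grads[i] += 2.0 * err * x**i / len(xs)
        coeffs = [c - lr * g for c, g in zip(coeffs, grads)]
    loss = sum((predict(coeffs, x) - t) ** 2 for x, t in zip(xs, targets)) / len(xs)
    return coeffs, loss


def search(iterations=4, max_degree=5, seed=0):
    rng = random.Random(seed)
    xs = [i / 10.0 - 1.0 for i in range(21)]  # inputs in [-1, 1]
    degree = 1
    coeffs, loss = distill(degree, xs)
    for _ in range(iterations):
        # Form sample data from the current candidate: neighbouring degrees.
        neighbours = [d for d in (degree - 1, degree + 1) if 0 <= d <= max_degree]
        # Select a second candidate architecture from the sample data.
        degree = rng.choice(neighbours)
        # Train it against the base network and adopt it as the current candidate.
        coeffs, loss = distill(degree, xs)
    return degree, coeffs, loss


final_degree, final_coeffs, final_loss = search()
print(final_degree, final_loss)
```

As in the abstract, the trained candidate is adopted unconditionally at the end of each iteration; a practical system would likely also weigh model size against how closely the student matches the base network.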

    METHOD AND APPARATUS FOR SEARCHING FOR NEURAL NETWORK ENSEMBLE MODEL, AND ELECTRONIC DEVICE

    Publication number: US20240311651A1

    Publication date: 2024-09-19

    Application number: US18668637

    Application date: 2024-05-20

    CPC classification number: G06N3/0985 G06N3/04

    Abstract: Disclosed is a method for searching for a neural network architecture ensemble model. The method includes: obtaining a dataset, where the dataset includes a sample and an annotation in a classification task; performing search by using a distributional neural network architecture search algorithm, including: determining a hyperparameter of a neural network architecture distribution; sampling a valid neural network architecture from the architecture distribution defined by the hyperparameter; training and evaluating the neural network architecture on the dataset, to obtain a performance indicator; determining, based on the performance indicator, neural network architecture distributions that share the hyperparameter, to obtain a candidate pool of base learners; and determining a surrogate model; and predicting test performance of the base learner in the candidate pool by using the surrogate model, and determining that k diverse base learners that meet a task scenario requirement form an ensemble model.
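
A rough Python sketch of the pipeline in this abstract follows. It assumes a toy search space of (depth, width) pairs, a single distribution hyperparameter `mean_depth`, a made-up `toy_validation_score` in place of real training, and a simple kernel-weighted average as the surrogate model. These choices are illustrative assumptions, not the method claimed in the patent.

```python
import math
import random


def sample_architecture(mean_depth, rng):
    """Sample a valid architecture from a distribution set by mean_depth."""
    depth = max(1, round(rng.gauss(mean_depth, 1.5)))
    width = rng.choice([16, 32, 64, 128])
    return (depth, width)


def toy_validation_score(arch):
    """Placeholder for training and evaluating the architecture on a dataset."""
    depth, width = arch
    return 1.0 - abs(depth - 6) * 0.05 - abs(width - 64) / 640.0


def distance(a, b):
    """Dissimilarity between two architectures, used as a diversity measure."""
    return abs(a[0] - b[0]) + abs(math.log2(a[1]) - math.log2(b[1]))


def surrogate_predict(arch, observations, bandwidth=1.0):
    """Kernel-weighted average of observed scores (a toy surrogate model)."""
    num = den = 0.0
    for other, score in observations:
        weight = math.exp(-distance(arch, other) / bandwidth)
        num += weight * score
        den += weight
    return num / den


def select_diverse(pool, observations, k, min_dist=2.0):
    """Greedily pick up to k base learners with high predicted performance
    that are at least min_dist apart from each other."""
    ranked = sorted(pool, key=lambda a: surrogate_predict(a, observations), reverse=True)
    chosen = []
    for arch in ranked:
        if all(distance(arch, c) >= min_dist for c in chosen):
            chosen.append(arch)
        if len(chosen) == k:
            break
    return chosen


def search_ensemble(k=3, num_samples=30, mean_depth=5.0, seed=0):
    rng = random.Random(seed)
    observations = []
    for _ in range(num_samples):
        arch = sample_architecture(mean_depth, rng)
        observations.append((arch, toy_validation_score(arch)))
    pool = sorted({arch for arch, _ in observations})  # candidate pool of base learners
    return select_diverse(pool, observations, k)


ensemble = search_ensemble()
print(ensemble)
```

The greedy selection trades a little predicted accuracy for diversity, which is the usual motivation for combining several different base learners into an ensemble.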
