BLACK-BOX OPTIMIZATION USING NEURAL NETWORKS

    Publication No.: US20220292404A1

    Publication date: 2022-09-15

    Application No.: US17830286

    Filing date: 2022-06-01

    Abstract: Methods and systems for determining an optimized setting for one or more process parameters of a machine learning training process. One of the methods includes processing a current network input using a recurrent neural network in accordance with first values of the network parameters to obtain a current network output, obtaining a measure of the performance of the machine learning training process with an updated setting defined by the current network output, and generating a new network input that comprises (i) the updated setting defined by the current network output and (ii) the measure of the performance of the training process with the updated setting defined by the current network output.
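The feedback loop in this abstract can be sketched as follows. This is a minimal illustration, not the claimed implementation: the plain tanh RNN, the randomly initialized (untrained) weights, the all-zero first input, the choice to return the best setting observed, and the names `init_params`, `rnn_step`, and `optimize` are all assumptions made for the example.

```python
import numpy as np

def init_params(setting_dim, hidden_dim, seed=0):
    """Random, untrained RNN weights (assumption: a trained network would be used in practice)."""
    rng = np.random.default_rng(seed)
    return {
        "W_in": rng.normal(scale=0.5, size=(hidden_dim, setting_dim + 1)),
        "W_h": rng.normal(scale=0.5, size=(hidden_dim, hidden_dim)),
        "b": np.zeros(hidden_dim),
        "W_out": rng.normal(scale=0.5, size=(setting_dim, hidden_dim)),
    }

def rnn_step(params, hidden, net_input):
    """One recurrent step: returns the new hidden state and the proposed setting."""
    hidden = np.tanh(params["W_in"] @ net_input + params["W_h"] @ hidden + params["b"])
    setting = params["W_out"] @ hidden
    return hidden, setting

def optimize(performance_fn, params, setting_dim, num_steps):
    """Alternate between proposing a setting and measuring its performance."""
    hidden = np.zeros(params["W_h"].shape[0])
    # Assumption: the first network input is an all-zero (setting, performance) pair.
    net_input = np.zeros(setting_dim + 1)
    history = []
    for _ in range(num_steps):
        hidden, setting = rnn_step(params, hidden, net_input)
        perf = performance_fn(setting)  # black-box measurement
        history.append((setting, perf))
        # New network input: (i) the updated setting and (ii) its measured performance.
        net_input = np.concatenate([setting, [perf]])
    best = max(history, key=lambda pair: pair[1])
    return best, history
```

Here `performance_fn` stands in for running the machine learning training process with the given setting and reporting a score where higher is better.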

    TRAINING A NEURAL NETWORK TO CONTROL AN AGENT USING TASK-RELEVANT ADVERSARIAL IMITATION LEARNING

    Publication No.: US20220261639A1

    Publication date: 2022-08-18

    Application No.: US17625361

    Filing date: 2020-07-16

    Abstract: A method is proposed of training a neural network to generate action data for controlling an agent to perform a task in an environment. The method includes obtaining, for each of a plurality of performances of the task, one or more first tuple datasets, each first tuple dataset comprising state data characterizing a state of the environment at a corresponding time during the performance of the task; and a concurrent process of training the neural network and a discriminator network. The training process comprises a plurality of neural network update steps and a plurality of discriminator network update steps. Each neural network update step comprises: receiving state data characterizing a current state of the environment; using the neural network and the state data to generate action data indicative of an action to be performed by the agent; forming a second tuple dataset comprising the state data; using the second tuple dataset to generate a reward value, wherein the reward value comprises an imitation value generated by the discriminator network based on the second tuple dataset; and updating one or more parameters of the neural network based on the reward value. Each discriminator network update step comprises updating the discriminator network based on a plurality of the first tuple datasets and a plurality of the second tuple datasets, the update being to increase respective imitation values which the discriminator network generates upon receiving any of the plurality of the first tuple datasets compared to respective imitation values which the discriminator network generates upon receiving any of the plurality of the second tuple datasets. The updating process is performed subject to a constraint that the updated discriminator network, upon receiving any of at least a certain proportion of a first subset of the first tuple datasets and/or any of at least a certain proportion of a second subset of the second tuple datasets, does not generate imitation values which correctly indicate that those tuple datasets are first or second tuple datasets.
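A constrained discriminator update of this kind might look like the sketch below. It is heavily simplified and rests on several assumptions: the discriminator is a single logistic unit, the constraint is enforced by rejecting any gradient step that makes the discriminator too accurate on the constraint subsets (the application may well use a different constrained optimizer), and the names `imitation_value`, `constrained_update`, and `max_accuracy` are hypothetical.

```python
import numpy as np

def imitation_value(w, x):
    """Logistic discriminator: values near 1 mean 'looks like a demonstration'."""
    return 1.0 / (1.0 + np.exp(-(x @ w)))

def discriminator_grad(w, expert, agent):
    """Gradient of binary cross-entropy with expert tuples labeled 1 and agent tuples labeled 0."""
    d_expert = imitation_value(w, expert)
    d_agent = imitation_value(w, agent)
    return -(expert.T @ (1.0 - d_expert)) / len(expert) + (agent.T @ d_agent) / len(agent)

def accuracy(w, expert, agent):
    """Fraction of tuples whose origin the discriminator identifies correctly."""
    correct = np.sum(imitation_value(w, expert) > 0.5) + np.sum(imitation_value(w, agent) <= 0.5)
    return correct / (len(expert) + len(agent))

def constrained_update(w, expert, agent, constraint_expert, constraint_agent,
                       lr=0.1, max_accuracy=0.5):
    """Take a gradient step, but keep it only if the constraint still holds.

    Assumption: the constraint is approximated as 'accuracy on the constraint
    subsets must not exceed max_accuracy', enforced by simple rejection.
    """
    candidate = w - lr * discriminator_grad(w, expert, agent)
    if accuracy(candidate, constraint_expert, constraint_agent) <= max_accuracy:
        return candidate
    return w
```

In the agent update, `imitation_value(w, x)` evaluated on a second tuple dataset `x` would supply the imitation part of the reward.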

    CROSS-MODAL SEQUENCE DISTILLATION
    Patent application

    Publication No.: US20200160843A1

    Publication date: 2020-05-21

    Application No.: US16687558

    Filing date: 2019-11-18

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a video speech recognition model having a plurality of model parameters on a set of unlabeled video-audio data and using a trained audio speech recognition model. During the training, the values of the parameters of the trained audio speech recognition model are generally fixed and only the values of the parameters of the video speech recognition model are adjusted. Once trained, the video speech recognition model can be used to recognize speech from video when corresponding audio is not available.
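The teacher-student arrangement in this abstract can be illustrated with a small sketch. It assumes frame-level linear softmax classifiers in place of real sequence models and a cross-entropy distillation loss; the abstract specifies neither, and `distill_step`, `student_w`, and `teacher_w` are hypothetical names.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_step(student_w, teacher_w, video, audio, lr=0.1):
    """One distillation step on unlabeled paired video-audio features.

    The audio (teacher) weights are only read, never updated; only the
    video (student) weights are adjusted.
    """
    targets = softmax(audio @ teacher_w)  # teacher predictions act as soft labels
    probs = softmax(video @ student_w)    # student predictions from video alone
    loss = -np.mean(np.sum(targets * np.log(probs + 1e-12), axis=-1))
    grad = video.T @ (probs - targets) / len(video)  # cross-entropy gradient w.r.t. student_w
    return student_w - lr * grad, loss
```

After training, only `student_w` is needed at inference time, which matches the stated use case of recognizing speech from video when no audio is available.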

    DUELING DEEP NEURAL NETWORKS
    Patent application

    Publication No.: US20180260689A1

    Publication date: 2018-09-13

    Application No.: US15977913

    Filing date: 2018-05-11

    Abstract: Systems, methods, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action from a set of actions to be performed by an agent interacting with an environment. In one aspect, the system includes a dueling deep neural network. The dueling deep neural network includes a value subnetwork, an advantage subnetwork, and a combining layer. The value subnetwork processes a representation of an observation to generate a value estimate. The advantage subnetwork processes the representation of the observation to generate an advantage estimate for each action in the set of actions. The combining layer combines the value estimate and the respective advantage estimate for each action to generate a respective Q value for the action. The system selects an action to be performed by the agent in response to the observation using the respective Q values for the actions in the set of actions.
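The combining layer can be sketched in a few lines. The abstract does not say how the two estimates are combined; the version below subtracts the mean advantage before adding the value, which is one common choice for dueling architectures and is an assumption here, as are the greedy selection rule and the function names.

```python
import numpy as np

def dueling_q_values(value, advantages):
    """Combine a scalar value estimate with per-action advantage estimates.

    Assumption: Q(a) = V + A(a) - mean(A), a common mean-centered combination.
    """
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()

def select_action(q_values):
    """Greedy selection: the index of the action with the highest Q value."""
    return int(np.argmax(q_values))

q = dueling_q_values(2.0, [1.0, 3.0, 2.0])
print(q)                 # [1. 3. 2.]
print(select_action(q))  # 1
```

Centering the advantages keeps the value and advantage streams identifiable: adding a constant to every advantage leaves the resulting Q values unchanged.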

    Black-box optimization using neural networks

    Publication No.: US12008445B2

    Publication date: 2024-06-11

    Application No.: US17830286

    Filing date: 2022-06-01

    CPC classification number: G06N20/00 G06F17/18 G06N3/04 G06N3/08

    Abstract: Methods and systems for determining an optimized setting for one or more process parameters of a machine learning training process. One of the methods includes processing a current network input using a recurrent neural network in accordance with first values of the network parameters to obtain a current network output, obtaining a measure of the performance of the machine learning training process with an updated setting defined by the current network output, and generating a new network input that comprises (i) the updated setting defined by the current network output and (ii) the measure of the performance of the training process with the updated setting defined by the current network output.

    NEURAL PROGRAMMING
    Patent publication
    Status: under examination, published

    Publication No.: US20240177001A1

    Publication date: 2024-05-30

    Application No.: US18497924

    Filing date: 2023-10-30

    CPC classification number: G06N3/08 G06N3/044 G06N20/00

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural programming. One of the methods includes processing a current neural network input using a core recurrent neural network to generate a neural network output; determining, from the neural network output, whether or not to end a currently invoked program and to return to a calling program from the set of programs; determining, from the neural network output, a next program to be called; determining, from the neural network output, contents of arguments to the next program to be called; receiving a representation of a current state of the environment; and generating a next neural network input from an embedding for the next program to be called and the representation of the current state of the environment.
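One way to picture the decoding steps listed in this abstract is the sketch below. The layout of the output vector (one end-of-program logit, then a program key, then the arguments), the dot-product lookup against stored program keys, and the names `decode_output` and `next_input` are all assumptions made for illustration; the abstract only states what is determined from the output, not how.

```python
import numpy as np

def decode_output(output, program_keys, num_args, end_threshold=0.5):
    """Split a core-network output into (end flag, next program index, arguments).

    Assumed layout: [end_logit, key..., args...].
    """
    end_prob = 1.0 / (1.0 + np.exp(-output[0]))
    key = output[1:len(output) - num_args]
    args = output[len(output) - num_args:]
    # Assumption: the next program is the one whose stored key best matches the emitted key.
    next_program = int(np.argmax(program_keys @ key))
    return bool(end_prob > end_threshold), next_program, args

def next_input(program_embeddings, next_program, state_repr):
    """Next network input: embedding of the next program plus the current state representation."""
    return np.concatenate([program_embeddings[next_program], state_repr])
```

When the end flag is true, a surrounding interpreter would pop the current program and return control to its caller; that call stack is omitted here.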
