Selecting actions by reverting to previous learned action selection policies

    公开(公告)号:US11423300B1

    公开(公告)日:2022-08-23

    申请号:US16271533

    申请日:2019-02-08

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a system output using a remembered value of a neural network hidden state. In one aspect, a system comprises an external memory that maintains context experience tuples respectively comprising: (i) a key embedding of context data, and (ii) a value of a hidden state of a neural network at the respective previous time step. The neural network is configured to receive a system input and a remembered value of the hidden state of the neural network and to generate a system output. The system comprises a memory interface subsystem that is configured to determine a key embedding for current context data, determine a remembered value of the hidden state of the neural network based on the key embedding, and provide the remembered value of the hidden state as an input to the neural network.

    Progressive neural networks
    34.
    发明授权

    公开(公告)号:US10949734B2

    公开(公告)日:2021-03-16

    申请号:US15396319

    申请日:2016-12-30

    Abstract: Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.

    Whitened neural network layers
    39.
    发明授权

    公开(公告)号:US10762421B2

    公开(公告)日:2020-09-01

    申请号:US15174020

    申请日:2016-06-06

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using a neural network system that includes a whitened neural network layer. One of the methods includes receiving an input activation generated by a layer before the whitened neural network layer in the sequence; processing the received activation in accordance with a set of whitening parameters to generate a whitened activation; processing the whitened activation in accordance with a set of layer parameters to generate an output activation; and providing the output activation as input to a neural network layer after the whitened neural network layer in the sequence.

    NEURAL NETWORKS FOR SELECTING ACTIONS TO BE PERFORMED BY A ROBOTIC AGENT

    公开(公告)号:US20200223063A1

    公开(公告)日:2020-07-16

    申请号:US16829237

    申请日:2020-03-25

    Abstract: A system includes a neural network system implemented by one or more computers. The neural network system is configured to receive an observation characterizing a current state of a real-world environment being interacted with by a robotic agent to perform a robotic task and to process the observation to generate a policy output that defines an action to be performed by the robotic agent in response to the observation. The neural network system includes: (i) a sequence of deep neural networks (DNNs), in which the sequence of DNNs includes a simulation-trained DNN that has been trained on interactions of a simulated version of the robotic agent with a simulated version of the real-world environment to perform a simulated version of the robotic task, and (ii) a first robot-trained DNN that is configured to receive the observation and to process the observation to generate the policy output.

Patent Agency Ranking