SELECTING REINFORCEMENT LEARNING ACTIONS USING GOALS AND OBSERVATIONS
    1.
    Patent application
    Status: Under examination (published)

    Publication number: US20160292568A1

    Publication date: 2016-10-06

    Application number: US15091840

    Filing date: 2016-04-06

    Applicant: Google Inc.

    CPC classification number: G06N3/08 G06N3/0454 G06N20/00

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning using goals and observations. One of the methods includes receiving an observation characterizing a current state of the environment; receiving a goal characterizing a target state from a set of target states of the environment; processing the observation using an observation neural network to generate a numeric representation of the observation; processing the goal using a goal neural network to generate a numeric representation of the goal; combining the numeric representation of the observation and the numeric representation of the goal to generate a combined representation; processing the combined representation using an action score neural network to generate a respective score for each action in the predetermined set of actions; and selecting the action to be performed using the respective scores for the actions in the predetermined set of actions.
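The steps in this abstract can be sketched as a small forward pass. This is only an illustration under stated assumptions, not the applicant's implementation: the layer sizes, the random untrained weights, the use of concatenation as the "combining" step, and greedy argmax selection are all hypothetical choices, since the abstract does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the abstract does not give any dimensions.
OBS_DIM, GOAL_DIM, EMBED_DIM, NUM_ACTIONS = 8, 4, 16, 3


def init_mlp(in_dim, out_dim, hidden=32):
    """Random weights for a one-hidden-layer network (untrained, illustrative)."""
    return {
        "w1": rng.normal(0.0, 0.1, (in_dim, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.1, (hidden, out_dim)),
        "b2": np.zeros(out_dim),
    }


def mlp(params, x):
    """Forward pass: linear -> ReLU -> linear."""
    h = np.maximum(0.0, x @ params["w1"] + params["b1"])
    return h @ params["w2"] + params["b2"]


observation_net = init_mlp(OBS_DIM, EMBED_DIM)
goal_net = init_mlp(GOAL_DIM, EMBED_DIM)
action_score_net = init_mlp(2 * EMBED_DIM, NUM_ACTIONS)


def select_action(observation, goal):
    """Return (chosen action index, per-action scores)."""
    obs_repr = mlp(observation_net, observation)   # numeric representation of the observation
    goal_repr = mlp(goal_net, goal)                # numeric representation of the goal
    # Assumption: "combining" is done by concatenation.
    combined = np.concatenate([obs_repr, goal_repr])
    scores = mlp(action_score_net, combined)       # one score per action
    # Assumption: greedy selection of the highest-scoring action.
    return int(np.argmax(scores)), scores


action, scores = select_action(np.ones(OBS_DIM), np.zeros(GOAL_DIM))
```

Other combination rules (for example an element-wise product) or a stochastic selection rule would fit the same abstract; the sketch only fixes one option to keep the data flow visible.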


    COMPRESSING IMAGES USING NEURAL NETWORKS
    2.
    Patent application

    Publication number: US20170230675A1

    Publication date: 2017-08-10

    Application number: US15396332

    Filing date: 2016-12-30

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for compressing images using neural networks. One of the methods includes receiving an image; processing the image using an encoder neural network, wherein the encoder neural network is configured to receive the image and to process the image to generate an output defining values of a first number of latent variables that each represent a feature of the image; generating a compressed representation of the image using the output defining the values of the first number of latent variables; and providing the compressed representation of the image for use in generating a reconstruction of the image.
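A rough sketch of the pipeline in this abstract follows. The encoder here is a stand-in (a random linear projection with tanh), and the 8-bit quantization and zlib step are assumptions chosen for illustration; the abstract only says that a compressed representation is generated from the latent-variable values and does not say how.

```python
import zlib

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; not taken from the abstract.
IMAGE_SHAPE = (16, 16)
NUM_LATENTS = 32

# Stand-in for trained encoder weights (random, untrained).
encoder_weights = rng.normal(0.0, 0.05, (IMAGE_SHAPE[0] * IMAGE_SHAPE[1], NUM_LATENTS))


def encode(image):
    """Map an image to NUM_LATENTS latent values, each in (-1, 1)."""
    return np.tanh(image.reshape(-1) @ encoder_weights)


def compress(image):
    """Encode, quantize to int8, then byte-compress (assumed scheme)."""
    latents = encode(image)
    quantized = np.round(latents * 127).astype(np.int8)
    return zlib.compress(quantized.tobytes())


def decompress_latents(payload):
    """Recover approximate latent values for a downstream reconstruction step."""
    quantized = np.frombuffer(zlib.decompress(payload), dtype=np.int8)
    return quantized.astype(np.float64) / 127.0


image = rng.random(IMAGE_SHAPE)
payload = compress(image)
recovered = decompress_latents(payload)
```

The recovered latents would be handed to whatever component generates the reconstruction; that component is outside what the abstract describes, so it is left out here.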

    RECURRENT NEURAL NETWORKS FOR DATA ITEM GENERATION
    3.
    Patent application
    Status: Under examination (published)

    Publication number: US20160232440A1

    Publication date: 2016-08-11

    Application number: US15016160

    Filing date: 2016-02-04

    Applicant: Google Inc.

    CPC classification number: G06N3/04 G06N3/0445 G06N3/0454 G10L13/02 G10L25/30

    Abstract: Methods and systems, including computer programs encoded on computer storage media, for generating data items. A method includes reading a glimpse from a data item using a decoder hidden state vector of a decoder for a preceding time step; providing, as input to an encoder, the glimpse and the decoder hidden state vector for the preceding time step for processing; receiving, as output from the encoder, a generated encoder hidden state vector for the time step; generating a decoder input from the generated encoder hidden state vector; providing the decoder input to the decoder for processing; receiving, as output from the decoder, a generated decoder hidden state vector for the time step; generating a neural network output update from the decoder hidden state vector for the time step; and combining the neural network output update with a current neural network output to generate an updated neural network output.
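The per-time-step loop in this abstract can be sketched as below. Everything concrete is an assumption made for illustration: the dimensions, the random untrained weights, the plain tanh recurrent cells, the soft-mask "glimpse", the deterministic linear map from encoder state to decoder input, and additive combination of the output update.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the abstract gives none.
DATA_DIM, HIDDEN_DIM, LATENT_DIM, NUM_STEPS = 16, 12, 4, 5


def weights(rows, cols):
    """Small random weight matrix (untrained, illustrative)."""
    return rng.normal(0.0, 0.1, (rows, cols))


W_read = weights(HIDDEN_DIM, DATA_DIM)
W_enc = weights(DATA_DIM + HIDDEN_DIM + HIDDEN_DIM, HIDDEN_DIM)
W_latent = weights(HIDDEN_DIM, LATENT_DIM)
W_dec = weights(LATENT_DIM + HIDDEN_DIM, HIDDEN_DIM)
W_write = weights(HIDDEN_DIM, DATA_DIM)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def generate(data_item, num_steps=NUM_STEPS):
    """Iteratively refine an output over num_steps time steps."""
    h_enc = np.zeros(HIDDEN_DIM)
    h_dec = np.zeros(HIDDEN_DIM)
    output = np.zeros(DATA_DIM)
    for _ in range(num_steps):
        # Read a glimpse using the decoder state from the preceding step
        # (assumption: a soft element-wise mask over the data item).
        glimpse = data_item * sigmoid(h_dec @ W_read)
        # Encoder receives the glimpse and the preceding decoder state;
        # its own previous state is carried as recurrent memory.
        h_enc = np.tanh(np.concatenate([glimpse, h_dec, h_enc]) @ W_enc)
        # Generate the decoder input from the encoder hidden state
        # (assumption: a deterministic linear projection).
        decoder_input = h_enc @ W_latent
        # Decoder produces its hidden state for this time step.
        h_dec = np.tanh(np.concatenate([decoder_input, h_dec]) @ W_dec)
        # Output update from the decoder state, combined additively (assumption).
        update = h_dec @ W_write
        output = output + update
    return output


result = generate(rng.random(DATA_DIM))
```

Because each step adds an update to the running output, the result is built up incrementally rather than produced in a single pass.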

