Generating Natural Language Descriptions of Images

    Publication No.: US20200042866A1

    Publication date: 2020-02-06

    Application No.: US16538712

    Filing date: 2019-08-12

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating descriptions of input images. One of the methods includes obtaining an input image; processing the input image using a first neural network to generate an alternative representation for the input image; and processing the alternative representation for the input image using a second neural network to generate a sequence of a plurality of words in a target natural language that describes the input image.
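
The two-stage pipeline in this abstract (a first network that maps the image to an alternative representation, then a second network that emits words one at a time) can be sketched as below. This is a minimal illustration only, not the patented method: the abstract does not specify either architecture, so the linear-plus-tanh encoder, the simple recurrent greedy decoder, the toy `VOCAB`, and every function name here are assumptions. The weights are random and untrained, so the emitted words carry no meaning.

```python
import math
import random

# Hypothetical toy vocabulary; a real system would use a learned one.
VOCAB = ["<start>", "<end>", "a", "dog", "cat", "on", "grass"]
START = VOCAB.index("<start>")


def make_matrix(rows, cols, rng):
    """Random weight matrix standing in for trained parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode_image(image, w_enc):
    """First network (stand-in): flatten the pixels, project to a fixed-size vector."""
    pixels = [p for row in image for p in row]
    return [math.tanh(x) for x in matvec(w_enc, pixels)]


def decode_caption(representation, w_state, w_embed, w_out, max_len=8):
    """Second network (stand-in): recurrent greedy decoder seeded with the representation."""
    state = list(representation)
    token = START
    words = []
    for _ in range(max_len):
        # New state depends on the previous state and the previous token's embedding.
        pre = matvec(w_state, state)
        state = [math.tanh(a + b) for a, b in zip(pre, w_embed[token])]
        scores = matvec(w_out, state)
        scores[START] = float("-inf")  # never emit the start marker
        token = max(range(len(VOCAB)), key=scores.__getitem__)
        if VOCAB[token] == "<end>":
            break
        words.append(VOCAB[token])
    return words


def describe(image, hidden=4, seed=0):
    """Run both stages; returns (alternative representation, word sequence)."""
    rng = random.Random(seed)
    n_pixels = sum(len(row) for row in image)
    w_enc = make_matrix(hidden, n_pixels, rng)
    w_state = make_matrix(hidden, hidden, rng)
    w_embed = make_matrix(len(VOCAB), hidden, rng)
    w_out = make_matrix(len(VOCAB), hidden, rng)
    representation = encode_image(image, w_enc)
    return representation, decode_caption(representation, w_state, w_embed, w_out)
```

Calling `describe([[0.1, 0.5], [0.9, 0.3]])` returns the fixed-size representation and a list of at most eight vocabulary words. The point of the sketch is the data flow: the decoder never sees the pixels, only the representation produced by the first stage.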

    Generating natural language descriptions of images

    Publication No.: US10417557B2

    Publication date: 2019-09-17

    Application No.: US15856453

    Filing date: 2017-12-28

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating descriptions of input images. One of the methods includes obtaining an input image; processing the input image using a first neural network to generate an alternative representation for the input image; and processing the alternative representation for the input image using a second neural network to generate a sequence of a plurality of words in a target natural language that describes the input image.

    Controlling agents using scene memory data
    Document type: Patent application

    Publication No.: US20200160172A1

    Publication date: 2020-05-21

    Application No.: US16602702

    Filing date: 2019-11-20

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling an agent. One of the methods includes receiving a current observation characterizing a current state of the environment as of the time step; generating an embedding of the current observation; processing scene memory data comprising embeddings of prior observations received at prior time steps using an encoder neural network, wherein the encoder neural network is configured to apply an encoder self-attention mechanism to the scene memory data to generate an encoded representation of the scene memory data; processing the encoded representation of the scene memory data and the embedding of the current observation using a decoder neural network to generate an action selection output; and causing the agent to perform the selected action.
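
The control loop in this abstract (encode the scene memory with self-attention, then let a decoder combine that encoding with the current observation's embedding to pick an action) can be sketched roughly as follows. This is an illustrative simplification, not the claimed system: it uses plain scaled dot-product attention with no learned query/key/value projections, a single attention layer per stage, and a hypothetical linear action head `w_action`. All function names are assumptions.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(queries, keys, values):
    """Scaled dot-product attention: each query gets a weighted sum of the values."""
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        outputs.append([sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)])
    return outputs


def select_action(memory, current, w_action):
    """memory: embeddings of prior observations; current: embedding of the current one."""
    # Encoder: self-attention over the scene memory (queries, keys, values all from memory).
    encoded = attend(memory, memory, memory)
    # Decoder: the current embedding is the query over the encoded memory.
    context = attend([current], encoded, encoded)[0]
    # Residual combination, then a linear head giving one score per action.
    features = [c + x for c, x in zip(context, current)]
    probs = softmax([dot(row, features) for row in w_action])
    action = max(range(len(probs)), key=probs.__getitem__)
    return action, probs
```

For example, with `memory = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]`, `current = [0.5, 0.5, 0, 0]`, and a three-row `w_action`, `select_action` returns the index of the highest-probability action together with the full distribution. An agent loop would then execute that action, embed the next observation, and append the previous embedding to `memory`.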
