SELECTING ACTIONS USING MULTI-MODAL INPUTS

    Publication number: US20210110115A1

    Publication date: 2021-04-15

    Application number: US16497602

    Application date: 2018-06-05

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system includes a language encoder model that is configured to receive a text string in a particular natural language, and process the text string to generate a text embedding of the text string. The system includes an observation encoder neural network that is configured to receive an observation characterizing a state of the environment, and process the observation to generate an observation embedding of the observation. The system includes a subsystem that is configured to obtain a current text embedding of a current text string and a current observation embedding of a current observation. The subsystem is configured to select an action to be performed by the agent in response to the current observation.
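
The data flow in this abstract (text embedding, observation embedding, then action selection) can be sketched as follows. This is only an illustrative toy, not the patented system: the hashed bag-of-words text encoder, the single linear layer standing in for the observation encoder network, the concatenation of the two embeddings, the random weights, and the `ACTIONS` list are all assumptions introduced for the example.

```python
from __future__ import annotations

import random

EMBED_DIM = 4
OBS_DIM = 3
ACTIONS = ["forward", "left", "right", "pick_up"]  # hypothetical action set


def encode_text(text: str) -> list[float]:
    """Toy stand-in for the language encoder: hashed bag of words."""
    emb = [0.0] * EMBED_DIM
    for word in text.lower().split():
        # Deterministic bucket from character codes (avoids Python's salted hash()).
        bucket = sum(ord(c) for c in word) % EMBED_DIM
        emb[bucket] += 1.0
    return emb


def encode_observation(obs: list[float], weights: list[list[float]]) -> list[float]:
    """Toy stand-in for the observation encoder: one linear layer plus ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, obs))) for row in weights]


def select_action(text_emb: list[float], obs_emb: list[float],
                  policy_weights: list[list[float]]) -> str:
    """Score every action from the combined embeddings and pick the best one."""
    combined = text_emb + obs_emb  # assumption: combine by concatenation
    scores = [sum(w * x for w, x in zip(row, combined)) for row in policy_weights]
    best = max(range(len(ACTIONS)), key=lambda i: scores[i])
    return ACTIONS[best]


# Untrained, randomly initialised weights; a real system would learn these.
rng = random.Random(0)
obs_weights = [[rng.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(EMBED_DIM)]
policy_weights = [[rng.uniform(-1, 1) for _ in range(2 * EMBED_DIM)] for _ in ACTIONS]

text_emb = encode_text("pick up the red ball")        # current text string
obs_emb = encode_observation([0.2, 0.9, 0.1], obs_weights)  # current observation
action = select_action(text_emb, obs_emb, policy_weights)
print(action)
```

Because the weights are untrained, the chosen action is arbitrary; the point is only that the action depends jointly on the text embedding and the observation embedding.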

    SEQUENCE TRANSDUCTION NEURAL NETWORKS
    Document type: Patent application

    Publication number: US20200151398A1

    Publication date: 2020-05-14

    Application number: US16746012

    Application date: 2020-01-17

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from an input sequence. In one aspect, a method comprises maintaining a set of current hypotheses, wherein each current hypothesis comprises an input prefix and an output prefix. For each possible combination of input and output prefix length, the method extends any current hypothesis that could reach the possible combination to generate respective extended hypotheses for each such current hypothesis; determines a respective direct score for each extended hypothesis using a direct model; determines a first number of highest-scoring hypotheses according to the direct scores; rescores the first number of highest-scoring hypotheses using a noisy channel model to generate a reduced number of hypotheses; and adds the reduced number of hypotheses to the set of current hypotheses.
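
The decoding loop described in this abstract can be sketched as below, under several loudly stated assumptions. The scoring functions `direct_score` and `channel_score` are hypothetical hand-written stand-ins for trained models (the toy task is copying the input), the rule that a hypothesis "could reach" the combination (i, j) if it has output length j - 1 and an input prefix no longer than i is an assumption, and `k1` and `k2` play the roles of the "first number" and the "reduced number" of hypotheses.

```python
VOCAB = ["a", "b", "c"]  # hypothetical output vocabulary


def direct_score(x, i, out):
    """Toy direct model: log-score of output prefix `out` given the first i input tokens."""
    score = 0.0
    for pos, tok in enumerate(out):
        if pos < i:
            score += -0.1 if tok == x[pos] else -2.0
        else:
            score += -1.0  # token produced without having read the matching input
    return score


def channel_score(x, i, out):
    """Toy noisy channel score: penalise mismatches and input/output length gaps."""
    mismatches = sum(1 for pos, tok in enumerate(out) if pos < i and tok != x[pos])
    return -3.0 * mismatches - 0.5 * abs(i - len(out))


def decode(x, k1=4, k2=2):
    n = len(x)
    # beams[(i, j)]: kept output prefixes of length j paired with input prefix length i.
    beams = {(0, 0): [()]}
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            # Extend every current hypothesis that could reach (i, j).
            candidates = set()
            for i_prev in range(i + 1):
                for out in beams.get((i_prev, j - 1), []):
                    for tok in VOCAB:
                        candidates.add(out + (tok,))
            # Keep the k1 best by direct score (ties broken by the prefix itself).
            top = sorted(candidates, key=lambda o: (-direct_score(x, i, o), o))[:k1]
            # Rescore those with the noisy channel model and keep only k2.
            kept = sorted(top, key=lambda o: (-channel_score(x, i, o), o))[:k2]
            beams[(i, j)] = kept
    # Final answer: best hypothesis that has consumed the whole input.
    finals = [o for j in range(1, n + 1) for o in beams[(n, j)]]
    return list(max(finals, key=lambda o: channel_score(x, n, o)))


result = decode(["a", "b", "c"])
print(result)
```

The cheap direct model prunes the candidate set first, so the noisy channel model only has to rescore `k1` hypotheses per (i, j) cell.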

    Sequence transduction neural networks

    Publication number: US10572603B2

    Publication date: 2020-02-25

    Application number: US16403281

    Application date: 2019-05-03

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from an input sequence. In one aspect, a method comprises maintaining a set of current hypotheses, wherein each current hypothesis comprises an input prefix and an output prefix. For each possible combination of input and output prefix length, the method extends any current hypothesis that could reach the possible combination to generate respective extended hypotheses for each such current hypothesis; determines a respective direct score for each extended hypothesis using a direct model; determines a first number of highest-scoring hypotheses according to the direct scores; rescores the first number of highest-scoring hypotheses using a noisy channel model to generate a reduced number of hypotheses; and adds the reduced number of hypotheses to the set of current hypotheses.

    ACTION SELECTION BASED ON ENVIRONMENT OBSERVATIONS AND TEXTUAL INSTRUCTIONS

    Publication number: US20220318516A1

    Publication date: 2022-10-06

    Application number: US17744921

    Application date: 2022-05-16

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system includes a language encoder model that is configured to receive a text string in a particular natural language, and process the text string to generate a text embedding of the text string. The system includes an observation encoder neural network that is configured to receive an observation characterizing a state of the environment, and process the observation to generate an observation embedding of the observation. The system includes a subsystem that is configured to obtain a current text embedding of a current text string and a current observation embedding of a current observation. The subsystem is configured to select an action to be performed by the agent in response to the current observation.

    Sequence transduction neural networks

    Publication number: US11423237B2

    Publication date: 2022-08-23

    Application number: US16746012

    Application date: 2020-01-17

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from an input sequence. In one aspect, a method comprises maintaining a set of current hypotheses, wherein each current hypothesis comprises an input prefix and an output prefix. For each possible combination of input and output prefix length, the method extends any current hypothesis that could reach the possible combination to generate respective extended hypotheses for each such current hypothesis; determines a respective direct score for each extended hypothesis using a direct model; determines a first number of highest-scoring hypotheses according to the direct scores; rescores the first number of highest-scoring hypotheses using a noisy channel model to generate a reduced number of hypotheses; and adds the reduced number of hypotheses to the set of current hypotheses.

    Action selection based on environment observations and textual instructions

    Publication number: US11354509B2

    Publication date: 2022-06-07

    Application number: US16497602

    Application date: 2018-06-05

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system includes a language encoder model that is configured to receive a text string in a particular natural language, and process the text string to generate a text embedding of the text string. The system includes an observation encoder neural network that is configured to receive an observation characterizing a state of the environment, and process the observation to generate an observation embedding of the observation. The system includes a subsystem that is configured to obtain a current text embedding of a current text string and a current observation embedding of a current observation. The subsystem is configured to select an action to be performed by the agent in response to the current observation.

    AUGMENTED RECURRENT NEURAL NETWORK WITH EXTERNAL MEMORY

    Publication number: US20200005147A1

    Publication date: 2020-01-02

    Application number: US16565245

    Application date: 2019-09-09

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the methods includes providing an output derived from the neural network output for the time step as a system output for the time step; maintaining a current state of the external memory; determining, from the neural network output for the time step, memory state parameters for the time step; updating the current state of the external memory using the memory state parameters for the time step; reading data from the external memory in accordance with the updated state of the external memory; and combining the data read from the external memory with a system input for the next time step to generate the neural network input for the next time step.
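
The per-time-step loop in this abstract can be sketched as follows. Everything concrete here is an assumption made for illustration: the `controller` function is a fixed arithmetic stand-in for the neural network, the memory is a small list of scalar slots, and the "memory state parameters" are reduced to a write slot, a write value, and a read slot.

```python
MEM_SLOTS = 4


def controller(nn_input):
    """Hypothetical stand-in for the neural network.

    Returns [system_output, write_signal, write_value, read_signal].
    """
    x, read_value = nn_input
    y = x + read_value
    return [y, x, y, x]


def memory_params(nn_output):
    """Derive the memory state parameters from the network output."""
    _, write_signal, write_value, read_signal = nn_output
    write_slot = int(write_signal) % MEM_SLOTS
    read_slot = int(read_signal) % MEM_SLOTS
    return write_slot, write_value, read_slot


def run(system_inputs):
    memory = [0.0] * MEM_SLOTS  # current state of the external memory
    read_value = 0.0            # nothing has been read before the first step
    outputs = []
    for x in system_inputs:
        # Combine the system input with the data read at the previous step.
        nn_input = [x, read_value]
        nn_output = controller(nn_input)
        # Provide an output derived from the network output as the system output.
        outputs.append(nn_output[0])
        # Update the memory, then read from the updated memory.
        write_slot, write_value, read_slot = memory_params(nn_output)
        memory[write_slot] = write_value
        read_value = memory[read_slot]
    return outputs, memory


outputs, memory = run([1.0, 2.0, 3.0])
print(outputs, memory)
```

With this toy controller the value read back at each step is the running total, so the system outputs accumulate the inputs; a trained network would instead learn what to write and where to read.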

    SEQUENCE TRANSDUCTION NEURAL NETWORKS
    Document type: Patent application

    Publication number: US20190258718A1

    Publication date: 2019-08-22

    Application number: US16403281

    Application date: 2019-05-03

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from an input sequence. In one aspect, a method comprises maintaining a set of current hypotheses, wherein each current hypothesis comprises an input prefix and an output prefix. For each possible combination of input and output prefix length, the method extends any current hypothesis that could reach the possible combination to generate respective extended hypotheses for each such current hypothesis; determines a respective direct score for each extended hypothesis using a direct model; determines a first number of highest-scoring hypotheses according to the direct scores; rescores the first number of highest-scoring hypotheses using a noisy channel model to generate a reduced number of hypotheses; and adds the reduced number of hypotheses to the set of current hypotheses.

    Action selection based on environment observations and textual instructions

    Publication number: US12265795B2

    Publication date: 2025-04-01

    Application number: US18649774

    Application date: 2024-04-29

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system includes a language encoder model that is configured to receive a text string in a particular natural language, and process the text string to generate a text embedding of the text string. The system includes an observation encoder neural network that is configured to receive an observation characterizing a state of the environment, and process the observation to generate an observation embedding of the observation. The system includes a subsystem that is configured to obtain a current text embedding of a current text string and a current observation embedding of a current observation. The subsystem is configured to select an action to be performed by the agent in response to the current observation.

    ACTION SELECTION BASED ON ENVIRONMENT OBSERVATIONS AND TEXTUAL INSTRUCTIONS

    Publication number: US20240320438A1

    Publication date: 2024-09-26

    Application number: US18649774

    Application date: 2024-04-29

    CPC classification number: G06F40/30 G06F17/16 G06N3/08

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a system includes a language encoder model that is configured to receive a text string in a particular natural language, and process the text string to generate a text embedding of the text string. The system includes an observation encoder neural network that is configured to receive an observation characterizing a state of the environment, and process the observation to generate an observation embedding of the observation. The system includes a subsystem that is configured to obtain a current text embedding of a current text string and a current observation embedding of a current observation. The subsystem is configured to select an action to be performed by the agent in response to the current observation.
