Patent search ap:("DeepMind Technologies Limited") AND inv:"Sergio Gomez" Page 1

1.

Invention publication
AUTOREGRESSIVELY GENERATING SEQUENCES OF DATA ELEMENTS DEFINING ACTIONS TO BE PERFORMED BY AN AGENT

Status: Under examination (published)

Publication number: US20240281654A1

Publication date: 2024-08-22

Application number: US18292165

Filing date: 2022-08-12

Applicant: DeepMind Technologies Limited

Inventor: Scott Ellison Reed, Konrad Zolna, Emilio Parisotto, Tom Erez, Alexander Novikov, Jack William Rae, Misha Man Ray Denil, Joao Ferdinando Gomes de Freitas, Oriol Vinyals, Sergio Gomez, Ashley Deloris Edwards, Jacob Bruce, Gabriel Barth-Maron

IPC: G06N3/08, G06N3/04

CPC classification number: G06N3/08, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent to interact with an environment using an action selection neural network. In one aspect, a method comprises, at each time step in a sequence of time steps: generating a current representation of a state of a task being performed by the agent in the environment as of the current time step as a sequence of data elements; autoregressively generating a sequence of data elements representing a current action to be performed by the agent at the current time step; and after autoregressively generating the sequence of data elements representing the current action, causing the agent to perform the current action at the current time step.
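The per-time-step loop described in this abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the toy scoring function stands in for the action selection neural network, the vocabulary size and action length are arbitrary, tokens are chosen greedily, and the "environment" simply appends the action tokens to the state sequence.

```python
from typing import Callable, List, Sequence

VOCAB_SIZE = 8      # hypothetical size of the data-element vocabulary
ACTION_LENGTH = 3   # hypothetical number of data elements per action


def toy_next_token_scores(context: Sequence[int]) -> List[float]:
    """Hypothetical stand-in for the action selection neural network.

    Returns one score per vocabulary entry, given all elements so far.
    """
    target = (sum(context) + len(context)) % VOCAB_SIZE
    return [1.0 if token == target else 0.0 for token in range(VOCAB_SIZE)]


def generate_action(
    state_tokens: Sequence[int],
    score_fn: Callable[[Sequence[int]], List[float]] = toy_next_token_scores,
    action_length: int = ACTION_LENGTH,
) -> List[int]:
    """Autoregressively generate the data elements that define one action."""
    context = list(state_tokens)
    action: List[int] = []
    for _ in range(action_length):
        scores = score_fn(context)
        # Greedy choice for determinism; sampling is an equally valid assumption.
        token = max(range(len(scores)), key=scores.__getitem__)
        action.append(token)
        # Each generated element conditions the generation of the next one.
        context.append(token)
    return action


def run_episode(initial_state: Sequence[int], num_steps: int) -> List[List[int]]:
    """At each time step: represent the state, generate an action, perform it."""
    state = list(initial_state)
    actions: List[List[int]] = []
    for _ in range(num_steps):
        action = generate_action(state)
        actions.append(action)
        # Hypothetical environment: "performing" the action extends the
        # state sequence with the action's data elements.
        state = state + action
    return actions
```

Calling `generate_action([1, 2, 3])` returns a list of `ACTION_LENGTH` tokens, each chosen only after the previous one has been appended to the context; the action is handed to the environment only once the full sequence is complete.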

2.

Invention publication
TRAINING REINFORCEMENT LEARNING AGENTS USING AUGMENTED TEMPORAL DIFFERENCE LEARNING

Status: Under examination (published)

Publication number: US20230376780A1

Publication date: 2023-11-23

Application number: US18029979

Filing date: 2021-10-01

Applicant: DeepMind Technologies Limited

Inventor: Caglar Gulcehre, Razvan Pascanu, Sergio Gomez

IPC: G06N3/092, G06N3/0442

CPC classification number: G06N3/092, G06N3/0442

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network used to select actions performed by an agent interacting with an environment by performing actions that cause the environment to transition states. One of the methods includes maintaining a replay memory storing a plurality of transitions; selecting a plurality of transitions from the replay memory; and training the neural network on the plurality of transitions, comprising, for each transition: generating an initial Q value for the transition; determining a scaled Q value for the transition; determining a scaled temporal difference learning target for the transition; determining an error between the scaled temporal difference learning target and the scaled Q value; determining an update to the current values of the Q network parameters; and determining an update to the current value of the scaling term.
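The sequence of steps listed in this abstract can be sketched as follows. The abstract does not say how the scaling is applied or how the scaling term is updated, so this sketch makes loud assumptions: a tabular Q function stands in for the Q network, the scaled Q value is `scale * Q(s, a)`, the scaled target is `r + gamma * scale * max_a' Q(s', a')`, and both updates are plain semi-gradient steps on the squared error. All names and hyperparameters are hypothetical.

```python
import random
from typing import List, Tuple

# (state, action, reward, next_state); a hypothetical transition layout.
Transition = Tuple[int, int, float, int]


def sample_batch(
    replay_memory: List[Transition], batch_size: int, seed: int = 0
) -> List[Transition]:
    """Select a plurality of transitions from the replay memory."""
    return random.Random(seed).sample(replay_memory, batch_size)


def train_step(
    q: List[List[float]],
    scale: float,
    batch: List[Transition],
    gamma: float = 0.9,
    lr_q: float = 0.1,
    lr_scale: float = 0.01,
) -> float:
    """Update the Q table in place and return the updated scaling term."""
    for state, action, reward, next_state in batch:
        # Initial Q value for the transition.
        initial_q = q[state][action]
        # Scaled Q value (assumption: multiplicative scaling).
        scaled_q = scale * initial_q
        # Scaled temporal difference learning target (assumed form).
        scaled_target = reward + gamma * scale * max(q[next_state])
        # Error between the scaled target and the scaled Q value.
        error = scaled_target - scaled_q
        # Update to the Q parameters: semi-gradient step, target held fixed.
        q[state][action] += lr_q * error * scale
        # Update to the scaling term: gradient step on the same error
        # (an assumption; the abstract does not specify this rule).
        scale += lr_scale * error * initial_q
    return scale
```

A typical call would draw a batch with `sample_batch(replay_memory, batch_size)` and pass it to `train_step`, carrying the returned scaling term into the next step.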

3.

Invention application
AUTOREGRESSIVELY GENERATING SEQUENCES OF DATA ELEMENTS DEFINING ACTIONS TO BE PERFORMED BY AN AGENT

Status: Granted

Publication number: US20230061411A1

Publication date: 2023-03-02

Application number: US17410689

Filing date: 2021-08-24

Applicant: DeepMind Technologies Limited

Inventor: Tom Erez, Alexander Novikov, Emilio Parisotto, Jack William Rae, Konrad Zolna, Misha Man Ray Denil, Joao Ferdinando Gomes de Freitas, Oriol Vinyals, Scott Ellison Reed, Sergio Gomez, Ashley Deloris Edwards, Jacob Bruce, Gabriel Barth-Maron

IPC: G06N3/08, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent to interact with an environment using an action selection neural network. In one aspect, a method comprises, at each time step in a sequence of time steps: generating a current representation of a state of a task being performed by the agent in the environment as of the current time step as a sequence of data elements; autoregressively generating a sequence of data elements representing a current action to be performed by the agent at the current time step; and after autoregressively generating the sequence of data elements representing the current action, causing the agent to perform the current action at the current time step.
