Patent search ap:("DeepMind Technologies Limited") AND inv:"Alexander Vezhnevets" Page 1

1.

Invention application
ACTION SELECTION FOR REINFORCEMENT LEARNING USING A MANAGER NEURAL NETWORK THAT GENERATES GOAL VECTORS DEFINING AGENT OBJECTIVES [Status: Active]

Publication No.: US20230090824A1

Publication date: 2023-03-23

Application No.: US18072175

Filing date: 2022-11-30

Applicant: DeepMind Technologies Limited

Inventor: Simon Osindero, Koray Kavukcuoglu, Alexander Vezhnevets

IPC: G06N3/08, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a system configured to select actions to be performed by an agent that interacts with an environment. The system comprises a manager neural network subsystem and a worker neural network subsystem. The manager subsystem is configured to, at each of the multiple time steps, generate a final goal vector for the time step. The worker subsystem is configured to, at each of multiple time steps, use the final goal vector generated by the manager subsystem to generate a respective action score for each action in a predetermined set of actions.
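
The manager/worker split described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the action names, the dimensions, the random stand-in weights, the unit-length normalization of the goal vector, and the softmax used to turn worker outputs into action scores. The sketch also treats the manager's output directly as the final goal vector for the time step.

```python
import math
import random

# Hypothetical predetermined action set and sizes (not from the patent).
ACTIONS = ["left", "right", "up", "down"]
OBS_DIM = 3
GOAL_DIM = 2


def random_matrix(rows, cols, rng):
    """Stand-in for trained weights: small random values."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def manager_goal(manager_weights, observation):
    """Manager: map the observation to a goal vector.

    Normalizing to unit length is an assumption of this sketch.
    """
    raw = matvec(manager_weights, observation)
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0  # avoid dividing by zero
    return [v / norm for v in raw]


def worker_scores(worker_weights, observation, goal):
    """Worker: one score per action, conditioned on the goal vector.

    Scores are softmax probabilities here; that choice is an assumption.
    """
    logits = matvec(worker_weights, observation + goal)
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(scores):
    """Pick the action with the highest score."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return ACTIONS[best]


rng = random.Random(0)
manager_weights = random_matrix(GOAL_DIM, OBS_DIM, rng)
worker_weights = random_matrix(len(ACTIONS), OBS_DIM + GOAL_DIM, rng)

# At each time step: the manager emits a goal, the worker scores every action.
for step in range(3):
    observation = [rng.uniform(-1.0, 1.0) for _ in range(OBS_DIM)]
    goal = manager_goal(manager_weights, observation)
    scores = worker_scores(worker_weights, observation, goal)
    print(step, select_action(scores), [round(s, 3) for s in scores])
```

The point of the sketch is only the data flow: the worker never chooses an action from the observation alone, since its scores always depend on the goal vector supplied by the manager for that time step.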

2.

Granted invention patent
Action selection for reinforcement learning using neural networks [Status: Active]

Publication No.: US10679126B2

Publication date: 2020-06-09

Application No.: US16511571

Filing date: 2019-07-15

Applicant: DeepMind Technologies Limited

Inventor: Simon Osindero, Koray Kavukcuoglu, Alexander Vezhnevets

IPC: G06N3/08, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a system configured to select actions to be performed by an agent that interacts with an environment. The system comprises a manager neural network subsystem and a worker neural network subsystem. The manager subsystem is configured to, at each of the multiple time steps, generate a final goal vector for the time step. The worker subsystem is configured to, at each of multiple time steps, use the final goal vector generated by the manager subsystem to generate a respective action score for each action in a predetermined set of actions.

3.

Granted invention patent
Action selection for reinforcement learning using a manager neural network that generates goal vectors defining agent objectives [Status: Active]

Publication No.: US11537887B2

Publication date: 2022-12-27

Application No.: US16866753

Filing date: 2020-05-05

Applicant: DeepMind Technologies Limited

Inventor: Simon Osindero, Koray Kavukcuoglu, Alexander Vezhnevets

IPC: G06N3/08, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a system configured to select actions to be performed by an agent that interacts with an environment. The system comprises a manager neural network subsystem and a worker neural network subsystem. The manager subsystem is configured to, at each of the multiple time steps, generate a final goal vector for the time step. The worker subsystem is configured to, at each of multiple time steps, use the final goal vector generated by the manager subsystem to generate a respective action score for each action in a predetermined set of actions.

4.

Invention application
ACTION SELECTION FOR REINFORCEMENT LEARNING USING NEURAL NETWORKS [Status: Under examination, published]

Publication No.: US20200265313A1

Publication date: 2020-08-20

Application No.: US16866753

Filing date: 2020-05-05

Applicant: DeepMind Technologies Limited

Inventor: Simon Osindero, Koray Kavukcuoglu, Alexander Vezhnevets

IPC: G06N3/08, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a system configured to select actions to be performed by an agent that interacts with an environment. The system comprises a manager neural network subsystem and a worker neural network subsystem. The manager subsystem is configured to, at each of the multiple time steps, generate a final goal vector for the time step. The worker subsystem is configured to, at each of multiple time steps, use the final goal vector generated by the manager subsystem to generate a respective action score for each action in a predetermined set of actions.

5.

Invention application
ACTION SELECTION FOR REINFORCEMENT LEARNING USING NEURAL NETWORKS [Status: Under examination, published]

Publication No.: US20190340509A1

Publication date: 2019-11-07

Application No.: US16511571

Filing date: 2019-07-15

Applicant: DeepMind Technologies Limited

Inventor: Simon Osindero, Koray Kavukcuoglu, Alexander Vezhnevets

IPC: G06N3/08, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a system configured to select actions to be performed by an agent that interacts with an environment. The system comprises a manager neural network subsystem and a worker neural network subsystem. The manager subsystem is configured to, at each of the multiple time steps, generate a final goal vector for the time step. The worker subsystem is configured to, at each of multiple time steps, use the final goal vector generated by the manager subsystem to generate a respective action score for each action in a predetermined set of actions.
