Patent search ap:("DeepMind Technologies Limited") AND inv:"Robert David Fergus" Page 1

1.

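Invention publication
IMITATION LEARNING BASED ON PREDICTION OF OUTCOMES (Under examination, published)

Publication (announcement) number: US20240185082A1

Publication (announcement) date: 2024-06-06

Application number: US18275722

Application date: 2022-02-04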

Applicant: DeepMind Technologies Limited

Inventor: Andrew Coulter Jaegle, Yury Sulsky, Gregory Duncan Wayne, Robert David Fergus

IPC: G06N3/092

CPC classification number: G06N3/092

Abstract: A method is proposed of training a policy model to generate action data for controlling an agent to perform a task in an environment. The method comprises: obtaining, for each of a plurality of performances of the task, a corresponding demonstrator trajectory comprising a plurality of sets of state data characterizing the environment at each of a plurality of corresponding successive time steps during the performance of the task; using the demonstrator trajectories to generate a demonstrator model, the demonstrator model being operative to generate, for any said demonstrator trajectory, a value indicative of the probability of the demonstrator trajectory occurring; and jointly training an imitator model and a policy model. The joint training is performed by: generating a plurality of imitation trajectories, each imitation trajectory being generated by repeatedly receiving state data indicating a state of the environment, using the policy model to generate action data indicative of an action, and causing the action to be performed by the agent; training the imitator model using the imitation trajectories, the imitator model being operative to generate, for any said imitation trajectory, a value indicative of the probability of the imitation trajectory occurring; and training the policy model using a reward function which is a measure of the similarity of the demonstrator model and the imitator model.
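The reward described in the abstract above can be illustrated with a toy sketch. The abstract does not say which density models or which similarity measure are used, so everything here is an assumption made for illustration: the class `GaussianTrajectoryModel`, the function `similarity_reward`, the one-dimensional states, and the choice of a log-density ratio as the similarity measure are all hypothetical and are not taken from the patent.

```python
import math
from statistics import fmean, pvariance


class GaussianTrajectoryModel:
    """Toy density model (illustrative assumption): every state in a
    trajectory is treated as an independent draw from one 1-D Gaussian."""

    def __init__(self, trajectories):
        states = [s for traj in trajectories for s in traj]
        self.mean = fmean(states)
        # Floor the variance so the log-density stays finite.
        self.var = max(pvariance(states), 1e-6)

    def log_prob(self, trajectory):
        """Log-probability of a whole trajectory under this model."""
        return sum(
            -0.5 * math.log(2 * math.pi * self.var)
            - (s - self.mean) ** 2 / (2 * self.var)
            for s in trajectory
        )


def similarity_reward(demonstrator, imitator, trajectory):
    """Hypothetical reward: log-density ratio of the two models.

    The value is larger when the trajectory is more probable under the
    demonstrator model than under the imitator model.
    """
    return demonstrator.log_prob(trajectory) - imitator.log_prob(trajectory)


# Made-up trajectories of scalar states, for illustration only.
demo_trajs = [[1.0, 1.1, 0.9], [1.0, 0.95, 1.05]]
imit_trajs = [[0.0, 0.2, -0.1], [0.1, -0.2, 0.0]]

demonstrator = GaussianTrajectoryModel(demo_trajs)
imitator = GaussianTrajectoryModel(imit_trajs)

reward_near_demo = similarity_reward(demonstrator, imitator, [1.0, 1.0])
reward_near_imit = similarity_reward(demonstrator, imitator, [0.0, 0.0])
```

In this sketch a trajectory that resembles the demonstrations earns a higher reward than one that resembles the current imitator behaviour. In the joint training the abstract describes, the imitator model would be refitted on fresh imitation trajectories as the policy changes; that loop is omitted here.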

2.

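Invention application
TRAINING A HIGH-LEVEL CONTROLLER TO GENERATE NATURAL LANGUAGE COMMANDS FOR CONTROLLING AN AGENT (Granted)

Publication (announcement) number: US20250093828A1

Publication (announcement) date: 2025-03-20

Application number: US18892260

Application date: 2024-09-20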

Applicant: DeepMind Technologies Limited

Inventor: Arun Ahuja, Robert David Fergus, Ishita Dasgupta, Kavya Venkata Kota Sai Kopparapu

IPC: G05B13/02, G06F40/58, G06N3/09, G06N3/092

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a high-level controller neural network for controlling an agent. In particular, the high-level controller neural network generates natural language commands that can be provided as input to a low-level controller neural network, which generates control outputs that can be used to control the agent.
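The data flow in the abstract above (a high-level controller emits a natural language command, and a low-level controller turns that command into a control output) can be sketched as follows. The abstract describes neural networks and a training procedure; this sketch shows neither. The functions `high_level_controller`, `low_level_controller` and `control_step`, the command strings, and the one-element observation are hypothetical rule-based stand-ins that only show how the two levels connect.

```python
from typing import List, Sequence, Tuple


def high_level_controller(observation: Sequence[float]) -> str:
    """Stand-in for the high-level network: maps an observation to a
    natural language command (hypothetical rule, not a trained model)."""
    return "move left" if observation[0] > 0 else "move right"


def low_level_controller(command: str, observation: Sequence[float]) -> List[float]:
    """Stand-in for the low-level network: maps a command plus the
    observation to a control output (here a single clipped velocity)."""
    direction = -1.0 if command == "move left" else 1.0
    return [direction * min(abs(observation[0]), 1.0)]


def control_step(observation: Sequence[float]) -> Tuple[str, List[float]]:
    """One control step: the command from the high level is passed as
    input to the low level, which produces the control output."""
    command = high_level_controller(observation)
    return command, low_level_controller(command, observation)
```

For example, `control_step([0.5])` yields the command `"move left"` together with the control output `[-0.5]`.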
