DATA-EFFICIENT HIERARCHICAL REINFORCEMENT LEARNING

    Publication Number: US20210187733A1

    Publication Date: 2021-06-24

    Application Number: US17050546

    Filing Date: 2019-05-17

    Applicant: Google LLC

    Abstract: Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to technique(s) that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
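The off-policy correction described in the abstract can be illustrated with a minimal sketch: among a set of candidate higher-level actions (goals), pick the one that best explains the lower-level actions actually observed, scored under the current lower-level policy. The class and function names below are hypothetical, and the Gaussian lower-level policy is a toy stand-in, not the patent's implementation:

```python
import math

class GaussianLowPolicy:
    """Toy lower-level policy: its mean action moves the state toward the
    goal (mean = goal - state). This is an illustrative assumption only."""
    def __init__(self, sigma=1.0):
        self.sigma = sigma

    def log_prob(self, state, goal, action):
        # Unnormalized Gaussian log-density of the action given (state, goal).
        mean = goal - state
        return -0.5 * ((action - mean) / self.sigma) ** 2

def relabel_goal(states, low_actions, low_policy, candidate_goals):
    """Off-policy correction sketch: re-label a past higher-level action
    with the candidate goal that maximizes the likelihood of the
    lower-level actions actually taken, under the CURRENT lower-level
    policy. The returned goal is then used to off-policy train the
    higher-level policy."""
    best_goal, best_logp = None, -math.inf
    for g in candidate_goals:
        logp = sum(low_policy.log_prob(s, g, a)
                   for s, a in zip(states, low_actions))
        if logp > best_logp:
            best_goal, best_logp = g, logp
    return best_goal
```

For example, if the observed lower-level actions `[3, 2, 1]` at states `[0, 1, 2]` are exactly what the current lower-level policy would do for goal `3`, then `relabel_goal([0, 1, 2], [3, 2, 1], GaussianLowPolicy(), [1, 3, 5])` selects `3` from the candidates.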

    DEEP REINFORCEMENT LEARNING FOR ROBOTIC MANIPULATION

    Publication Number: US20250153352A1

    Publication Date: 2025-05-15

    Application Number: US19025551

    Filing Date: 2025-01-16

    Applicant: GOOGLE LLC

    Abstract: Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.
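The collection loop described above can be sketched as a simple single-process simulation: multiple robots each run episodes guided by the latest policy parameters, experience accumulates in a shared buffer, and a learner updates the parameters from batches of that experience. All names, the scalar "parameters", and the update rule are illustrative assumptions, not the patent's method:

```python
import random
from collections import deque

class PolicyStore:
    """Shared store for the current policy parameters (sketch: a scalar)."""
    def __init__(self, params):
        self.params = params
    def get(self):
        return self.params
    def set(self, params):
        self.params = params

def run_episode(robot_id, params, steps=5):
    """One robot's exploration episode, guided by the given parameters.
    Each step yields one instance of experience data (hypothetical tuple)."""
    return [(robot_id, params, random.random()) for _ in range(steps)]

def train(num_robots=3, num_rounds=4, batch_size=8):
    store = PolicyStore(params=0.0)
    buffer = deque(maxlen=100)  # replay buffer of collected experience
    for _ in range(num_rounds):
        # Before each episode, every robot retrieves the latest parameters.
        for rid in range(num_robots):
            buffer.extend(run_episode(rid, store.get()))
        # The learner updates parameters from a batch of collected experience
        # (a placeholder increment stands in for a gradient step).
        batch = random.sample(buffer, min(batch_size, len(buffer)))
        store.set(store.get() + 0.1 * len(batch) / batch_size)
    return store.get(), len(buffer)
```

The key property the sketch preserves is the asynchrony-friendly handshake: robots only read parameters at episode boundaries, so the learner can update them continuously in between.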

    DATA-EFFICIENT HIERARCHICAL REINFORCEMENT LEARNING

    Publication Number: US20240308068A1

    Publication Date: 2024-09-19

    Application Number: US18673510

    Filing Date: 2024-05-24

    Applicant: GOOGLE LLC

    CPC Classification Number: B25J9/163

    Abstract: Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to technique(s) that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
