-
Publication Number: US11845183B2
Publication Date: 2023-12-19
Application Number: US17878186
Filing Date: 2022-08-01
Applicant: Google LLC
Inventor: Sergey Levine, Ethan Holly, Shixiang Gu, Timothy Lillicrap
CPC classification number: B25J9/161, B25J9/163, B25J9/1664, G05B13/027, G05B19/042, G06N3/008, G06N3/045, G06N3/08, G05B2219/32335, G05B2219/33033, G05B2219/33034, G05B2219/39001, G05B2219/39298, G05B2219/40499
Abstract: Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.
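A minimal sketch of the collect-and-train loop this abstract describes: several robots run exploration episodes with the latest policy parameters, push transitions into a shared experience buffer, and a trainer iteratively updates the parameters from sampled batches. All names (PolicyNet, Transition, run_episode, train_step) and the toy one-dimensional task are hypothetical stand-ins, not the patent's implementation.

```python
import collections
import random

Transition = collections.namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"])

class PolicyNet:
    """Stand-in for the policy neural network; params is a single scalar."""
    def __init__(self):
        self.params = 0.0

    def act(self, state):
        # Exploration: sample around the current policy mean.
        return random.gauss(self.params, 1.0)

buffer = collections.deque(maxlen=10_000)   # shared experience buffer
policy = PolicyNet()                        # shared, iteratively updated

def run_episode(policy_snapshot, horizon=20):
    """One exploration episode, guided by the current policy parameters."""
    state = 1.0
    for _ in range(horizon):
        action = policy_snapshot.act(state)
        next_state = state + action
        reward = -abs(next_state)           # toy task: drive the state to zero
        buffer.append(Transition(state, action, reward, next_state, False))
        state = next_state

def train_step(batch_size=32):
    """Update policy parameters from a batch of collected experience."""
    if len(buffer) < batch_size:
        return
    batch = random.sample(list(buffer), batch_size)
    grad = sum(t.reward * t.action for t in batch) / batch_size
    policy.params += 1e-3 * grad            # toy reward-weighted update

for _ in range(100):            # each round: all robots act, then one update
    for _robot in range(4):     # 4 robots collecting experience in parallel
        run_episode(policy)     # each fetches the latest parameters first
    train_step()
```

The point of the structure is that each episode fetches the current parameters before it starts, so experience from every robot continually reflects the newest policy.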
-
Publication Number: US11548145B2
Publication Date: 2023-01-10
Application Number: US17172666
Filing Date: 2021-02-10
Applicant: Google LLC
Inventor: Sergey Levine, Peter Pastor Sampedro, Alex Krizhevsky
Abstract: Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a deep neural network to predict a measure that candidate motion data for an end effector of a robot will result in a successful grasp of one or more objects by the end effector. Some implementations are directed to utilization of the trained deep neural network to servo a grasping end effector of a robot to achieve a successful grasp of an object by the grasping end effector. For example, the trained deep neural network may be utilized in the iterative updating of motion control commands for one or more actuators of a robot that control the pose of a grasping end effector of the robot, and to determine when to generate grasping control commands to effectuate an attempted grasp by the grasping end effector.
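A hedged sketch of how such servoing could look: a trained predictor scores candidate end-effector motions for grasp success, a simple cross-entropy-method search (an illustrative optimizer choice) picks the most promising motion, and the controller decides between issuing motion commands and grasping commands. grasp_net below is a placeholder heuristic standing in for the trained deep neural network.

```python
import numpy as np

def grasp_net(image, motion):
    # Placeholder for the trained network: here, smaller motions score
    # higher; a real model would consume the image as well.
    return float(np.exp(-np.linalg.norm(motion)))

def best_motion(image, n_samples=64, n_elite=6, iters=3):
    """Cross-entropy search over candidate end-effector motions."""
    mean, std = np.zeros(3), np.ones(3)
    for _ in range(iters):
        candidates = np.random.normal(mean, std, size=(n_samples, 3))
        scores = np.array([grasp_net(image, c) for c in candidates])
        elite = candidates[np.argsort(scores)[-n_elite:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean

def servo_step(image, close_threshold=0.9):
    """One control iteration: move the end effector, or attempt the grasp."""
    motion = best_motion(image)
    if grasp_net(image, motion) > close_threshold:
        return "GRASP", None      # generate grasping control commands
    return "MOVE", motion         # update motion control commands and repeat
```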
-
Publication Number: US20240131695A1
Publication Date: 2024-04-25
Application Number: US18526443
Filing Date: 2023-12-01
Applicant: GOOGLE LLC
Inventor: Sergey Levine, Ethan Holly, Shixiang Gu, Timothy Lillicrap
CPC classification number: B25J9/161, B25J9/163, B25J9/1664, G05B13/027, G05B19/042, G06N3/008, G06N3/045, G06N3/08, G05B2219/32335, G05B2219/33033, G05B2219/33034, G05B2219/39001, G05B2219/39298, G05B2219/40499
Abstract: Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.
-
Publication Number: US20200338722A1
Publication Date: 2020-10-29
Application Number: US16622309
Filing Date: 2018-06-28
Applicant: Google LLC
Inventor: Eric Jang, Sudheendra Vijayanarasimhan, Peter Pastor Sampedro, Julian Ibarz, Sergey Levine
Abstract: Deep machine learning methods and apparatus related to semantic robotic grasping are provided. Some implementations relate to training a grasp neural network, a semantic neural network, and a joint neural network of a semantic grasping model. In some of those implementations, the joint network is a deep neural network and can be trained based on both: grasp losses generated based on grasp predictions generated over the grasp neural network, and semantic losses generated based on semantic predictions generated over the semantic neural network. Some implementations are directed to utilization of the trained semantic grasping model to servo, or control, a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).
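A minimal PyTorch sketch of the two-loss training the abstract describes, with hypothetical module shapes: both the grasp loss and the semantic loss backpropagate into the shared joint network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

joint_net = nn.Sequential(nn.Linear(64, 32), nn.ReLU())  # shared joint network
grasp_head = nn.Linear(32, 1)        # grasp prediction branch
semantic_head = nn.Linear(32, 10)    # semantic prediction branch (10 classes)

opt = torch.optim.Adam(
    [*joint_net.parameters(), *grasp_head.parameters(),
     *semantic_head.parameters()], lr=1e-3)

def train_step(features, grasp_label, class_label):
    shared = joint_net(features)
    grasp_loss = F.binary_cross_entropy_with_logits(
        grasp_head(shared).squeeze(-1), grasp_label)
    semantic_loss = F.cross_entropy(semantic_head(shared), class_label)
    loss = grasp_loss + semantic_loss   # both losses train the joint network
    opt.zero_grad()
    loss.backward()
    opt.step()
    return float(loss)

# Toy batch: 16 feature vectors with grasp-success and class labels.
train_step(torch.randn(16, 64),
           torch.randint(0, 2, (16,)).float(),
           torch.randint(0, 10, (16,)))
```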
-
Publication Number: US12240113B2
Publication Date: 2025-03-04
Application Number: US18526443
Filing Date: 2023-12-01
Applicant: GOOGLE LLC
Inventor: Sergey Levine, Ethan Holly, Shixiang Gu, Timothy Lillicrap
Abstract: Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.
-
Publication Number: US11992945B2
Publication Date: 2024-05-28
Application Number: US17094521
Filing Date: 2020-11-10
Applicant: Google LLC
Inventor: Jie Tan, Sehoon Ha, Peng Xu, Sergey Levine, Zhenyu Tan
CPC classification number: B25J9/163, B25J9/162, B25J9/1689, B25J13/089, G05D1/02, G06N3/08
Abstract: Techniques are disclosed that enable training a plurality of policy networks, each policy network corresponding to a disparate robotic training task, using a mobile robot in a real world workspace. Various implementations include selecting a training task based on comparing a pose of the mobile robot to at least one parameter of a real world training workspace. For example, the training task can be selected based on the position of a landmark, within the workspace, relative to the pose. For instance, the training task can be selected such that the selected training task moves the mobile robot towards the landmark.
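A small sketch of pose-based task selection under stated assumptions: the landmark position, the candidate task names, and the bearing threshold below are all invented for illustration, not taken from the patent.

```python
import math

LANDMARK = (5.0, 0.0)  # position of a landmark within the training workspace

def select_training_task(robot_pose):
    """Pick the task whose execution moves the robot toward the landmark.

    robot_pose: (x, y, heading_in_radians)
    """
    x, y, heading = robot_pose
    bearing = math.atan2(LANDMARK[1] - y, LANDMARK[0] - x)
    # Signed heading error, wrapped to [-pi, pi).
    error = (bearing - heading + math.pi) % (2 * math.pi) - math.pi
    if abs(error) < math.pi / 8:
        return "walk_forward"                       # landmark roughly ahead
    return "turn_left" if error > 0 else "turn_right"

# Robot at the origin facing +y; the landmark lies to its right.
print(select_training_task((0.0, 0.0, math.pi / 2)))   # -> "turn_right"
```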
-
Publication Number: US11992944B2
Publication Date: 2024-05-28
Application Number: US17050546
Filing Date: 2019-05-17
Applicant: Google LLC
Inventor: Honglak Lee, Shixiang Gu, Sergey Levine
IPC: B25J9/16
CPC classification number: B25J9/163
Abstract: Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to technique(s) that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
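A numeric sketch of the re-labeling step, with hypothetical one-dimensional states and goals: among several candidate higher-level actions, keep the one under which the current lower-level policy would most likely have produced the logged low-level actions. The candidate set and Gaussian-style log-probability are illustrative assumptions.

```python
import numpy as np

def lower_level_action(state, goal):
    # Stand-in for the *current* lower-level policy: step toward the goal.
    return float(np.clip(goal - state, -1.0, 1.0))

def relabel_goal(states, low_actions, original_goal, n_candidates=8):
    """Return the higher-level action maximizing log-prob of logged actions."""
    candidates = [original_goal, states[-1] - states[0]]  # include s_T - s_0
    candidates += list(np.random.normal(original_goal, 0.5, n_candidates))
    def log_prob(goal):
        preds = [lower_level_action(s, goal) for s in states[:-1]]
        diffs = np.array(preds) - np.array(low_actions)
        return -0.5 * float(np.sum(diffs ** 2))
    return max(candidates, key=log_prob)

# Logged experience came from an older lower-level policy version.
states = [0.0, 0.3, 0.7, 1.0]
low_actions = [0.3, 0.4, 0.3]
print(relabel_goal(states, low_actions, original_goal=2.0))
```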
-
Publication Number: US20230367996A1
Publication Date: 2023-11-16
Application Number: US18044852
Filing Date: 2021-09-23
Applicant: Google LLC
Inventor: Anurag Ajay, Ofir Nachum, Aviral Kumar, Sergey Levine
IPC: G06N3/0455, G06N3/092
CPC classification number: G06N3/0455, G06N3/092
Abstract: A method includes determining a first state associated with a particular task, and determining, by a task policy model, a latent space representation of the first state. The task policy model may have been trained to define, for each respective state of a plurality of possible states associated with the particular task, a corresponding latent space representation of the respective state. The method also includes determining, by a primitive policy model and based on the first state and the latent space representation of the first state, an action to take as part of the particular task. The primitive policy model may have been trained to define a space of primitive policies for the plurality of possible states associated with the particular task and a plurality of possible latent space representations. The method further includes executing the action to reach a second state associated with the particular task.
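A tiny sketch of the two-stage inference the method describes, using linear stand-ins for both models: the task policy maps the first state to its latent space representation, and the primitive policy maps the state plus that latent code to an action, which is executed to reach the second state. All shapes and the toy environment are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W_task = rng.normal(size=(4, 8))      # task policy model: state -> latent
W_prim = rng.normal(size=(2, 8 + 4))  # primitive policy: [state; latent] -> action

def task_policy(state):
    """Latent space representation of the given state."""
    return np.tanh(W_task @ state)

def primitive_policy(state, latent):
    """Action conditioned on both the state and its latent representation."""
    return np.tanh(W_prim @ np.concatenate([state, latent]))

def step(env, state):
    latent = task_policy(state)                 # first stage
    action = primitive_policy(state, latent)    # second stage
    return env(state, action)                   # execute to reach next state

def toy_env(state, action):
    nxt = state.copy()
    nxt[:2] += 0.1 * action   # the action nudges the first two dimensions
    return nxt

second_state = step(toy_env, np.ones(8))
```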
-
Publication Number: US20230311335A1
Publication Date: 2023-10-05
Application Number: US18128953
Filing Date: 2023-03-30
Applicant: GOOGLE LLC
Inventor: Karol Hausman, Brian Ichter, Sergey Levine, Alexander Toshev, Fei Xia, Carolina Parada
CPC classification number: B25J13/003, B25J11/0005, B25J9/163, B25J9/161, G06F40/40
Abstract: Implementations process, using a large language model (LLM), a free-form natural language (NL) instruction to generate LLM output. Those implementations generate, based on the LLM output and an NL skill description of a robotic skill, a task-grounding measure that reflects a probability of the skill description in the probability distribution of the LLM output. Those implementations further generate, based on the robotic skill and current environmental state data, a world-grounding measure that reflects a probability of the robotic skill being successful based on the current environmental state data. Those implementations further determine, based on both the task-grounding measure and the world-grounding measure, whether to implement the robotic skill.
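A toy sketch of combining the two measures; multiplying them and thresholding is one plausible combination rule, and every probability below is a made-up placeholder rather than real LLM or affordance output.

```python
def task_grounding(instruction, skill_description):
    # Placeholder for the probability of the skill description under the
    # LLM's output distribution for this instruction.
    table = {"pick up the sponge": 0.7, "go to the sink": 0.2,
             "open the drawer": 0.1}
    return table.get(skill_description, 0.0)

def world_grounding(skill_description, state):
    # Placeholder for the probability the skill succeeds in the current state.
    return state.get(skill_description, 0.0)

def select_skill(instruction, skills, state, min_score=0.05):
    scores = {s: task_grounding(instruction, s) * world_grounding(s, state)
              for s in skills}
    best = max(scores, key=scores.get)
    # Below the threshold, decline to implement any robotic skill.
    return best if scores[best] >= min_score else None

state = {"pick up the sponge": 0.1, "go to the sink": 0.9,
         "open the drawer": 0.5}
print(select_skill("clean the spill", list(state), state))  # -> go to the sink
```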
-
Publication Number: US11701773B2
Publication Date: 2023-07-18
Application Number: US16622181
Filing Date: 2018-12-04
Applicant: Google LLC
Inventor: Alexander Toshev, Fereshteh Sadeghi, Sergey Levine
CPC classification number: B25J9/163, B25J9/1697, G05B13/027, G06N3/044, G06N3/045, G06N3/084, G05B2219/33056, G05B2219/39391, G05B2219/40499, G05B2219/42152
Abstract: Training and/or using a recurrent neural network model for visual servoing of an end effector of a robot. In visual servoing, the model can be utilized to generate, at each of a plurality of time steps, an action prediction that represents a prediction of how the end effector should be moved to cause the end effector to move toward a target object. The model can be viewpoint invariant in that it can be utilized across a variety of robots having vision components at a variety of viewpoints and/or can be utilized for a single robot even when a viewpoint, of a vision component of the robot, is drastically altered. Moreover, the model can be trained based on a large quantity of simulated data that is based on simulator(s) performing simulated episode(s) in view of the model. One or more portions of the model can be further trained based on a relatively smaller quantity of real training data.
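A bare-bones sketch of the recurrent servoing loop, with a tiny hand-rolled RNN cell standing in for the trained model; the dimensions and the zero observation in the usage line are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
Wx = rng.normal(size=(16, 32), scale=0.1)   # observation -> hidden
Wh = rng.normal(size=(16, 16), scale=0.1)   # hidden -> hidden (memory)
Wa = rng.normal(size=(3, 16), scale=0.1)    # hidden -> action prediction

def rnn_step(observation, hidden):
    hidden = np.tanh(Wx @ observation + Wh @ hidden)
    action = Wa @ hidden        # predicted motion toward the target object
    return action, hidden

def servo(get_observation, apply_action, steps=50):
    hidden = np.zeros(16)       # memory lets the model adapt to the viewpoint
    for _ in range(steps):
        action, hidden = rnn_step(get_observation(), hidden)
        apply_action(action)

servo(lambda: np.zeros(32), lambda a: None, steps=5)
```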
-