-
Publication number: US11772272B2
Publication date: 2023-10-03
Application number: US17203296
Filing date: 2021-03-16
Applicant: GOOGLE LLC
Inventor: Seyed Mohammad Khansari Zadeh , Eric Jang , Daniel Lam , Daniel Kappler , Matthew Bennice , Brent Austin , Yunfei Bai , Sergey Levine , Alexander Irpan , Nicolas Sievers , Chelsea Finn
CPC classification number: B25J9/1697 , B25J9/161 , B25J9/163 , B25J9/1661 , B25J13/06
Abstract: Implementations described herein relate to training and refining robotic control policies using imitation learning techniques. A robotic control policy can be initially trained based on human demonstrations of various robotic tasks. Further, the robotic control policy can be refined based on human interventions while a robot is performing a robotic task. In some implementations, the robotic control policy may determine whether the robot will fail in performance of the robotic task, and prompt a human to intervene in performance of the robotic task. In additional or alternative implementations, a representation of the sequence of actions can be visually rendered for presentation to the human, so that the human can proactively intervene in performance of the robotic task.
-
Publication number: US20210237266A1
Publication date: 2021-08-05
Application number: US17052679
Filing date: 2019-06-14
Applicant: Google LLC
Inventor: Dmitry Kalashnikov , Alexander Irpan , Peter Pastor Sampedro , Julian Ibarz , Alexander Herzog , Eric Jang , Deirdre Quillen , Ethan Holly , Sergey Levine
Abstract: Using large-scale reinforcement learning to train a policy model that can be utilized by a robot in performing a robotic task in which the robot interacts with one or more environmental objects. In various implementations, off-policy deep reinforcement learning is used to train the policy model, and the off-policy deep reinforcement learning is based on self-supervised data collection. The policy model can be a neural network model. Implementations of the reinforcement learning utilized in training the neural network model utilize a continuous-action variant of Q-learning. Through techniques disclosed herein, implementations can learn policies that generalize effectively to previously unseen objects, previously unseen environments, etc.
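The continuous-action variant of Q-learning mentioned above requires maximizing a Q-function over a continuous action space; a common way to approximate that maximization is the cross-entropy method. Below is a minimal sketch under assumptions not stated in the abstract: a toy quadratic Q-function, actions clipped to [-1, 1], and illustrative population/elite sizes.

```python
import numpy as np

def cem_argmax_q(q_fn, state, action_dim, iters=5, pop=64, elite=6, seed=0):
    """Approximate argmax over actions of Q(state, action) with the
    cross-entropy method: repeatedly sample candidate actions, keep the
    highest-scoring elites, and refit the sampling distribution to them.
    All hyperparameters and the action bounds here are illustrative."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(action_dim), np.ones(action_dim)
    for _ in range(iters):
        actions = np.clip(rng.normal(mu, sigma, size=(pop, action_dim)), -1, 1)
        scores = np.array([q_fn(state, a) for a in actions])
        elites = actions[np.argsort(scores)[-elite:]]   # highest-Q candidates
        mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mu

# Toy Q-function whose maximum over actions lies at a = [0.5, 0.5, 0.5].
q = lambda s, a: -float(np.sum((a - 0.5) ** 2))
best_action = cem_argmax_q(q, state=None, action_dim=3)
```

In practice the same loop is run with the learned Q-network in place of the toy function, once per control step.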
-
Publication number: US11045949B2
Publication date: 2021-06-29
Application number: US16823947
Filing date: 2020-03-19
Applicant: Google LLC
Inventor: Sudheendra Vijayanarasimhan , Eric Jang , Peter Pastor Sampedro , Sergey Levine
Abstract: Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a semantic grasping model to predict a measure that indicates whether motion data for an end effector of a robot will result in a successful grasp of an object; and to predict an additional measure that indicates whether the object has desired semantic feature(s). Some implementations are directed to utilization of the trained semantic grasping model to servo a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).
-
Publication number: US12083678B2
Publication date: 2024-09-10
Application number: US17422260
Filing date: 2020-01-23
Applicant: Google LLC
Inventor: Mrinal Kalakrishnan , Yunfei Bai , Paul Wohlhart , Eric Jang , Chelsea Finn , Seyed Mohammad Khansari Zadeh , Sergey Levine , Allan Zhou , Alexander Herzog , Daniel Kappler
IPC: B25J9/16
CPC classification number: B25J9/163 , G05B2219/40116 , G05B2219/40499
Abstract: Techniques are disclosed that enable training a meta-learning model, for use in causing a robot to perform a task, using imitation learning as well as reinforcement learning. Some implementations relate to training the meta-learning model using imitation learning based on one or more human guided demonstrations of the task. Additional or alternative implementations relate to training the meta-learning model using reinforcement learning based on trials of the robot attempting to perform the task. Further implementations relate to using the trained meta-learning model to few-shot (or one-shot) learn a new task based on a human guided demonstration of the new task.
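Few-shot adaptation of this kind is often implemented in the style of model-agnostic meta-learning: a meta-learned initialization is adapted to a new task with a small number of gradient steps on imitation loss computed from a single demonstration. A minimal sketch, with a linear policy and squared imitation error standing in for the patent's meta-learning model (the shapes, learning rate, and loss are assumptions):

```python
import numpy as np

def adapt_to_demo(theta, demo_x, demo_y, lr=0.1):
    """One inner-loop adaptation step: adjust meta-learned policy
    parameters theta toward a new task by gradient descent on squared
    imitation error over one demonstration batch. The linear policy is
    an illustrative stand-in for a deep network."""
    pred = demo_x @ theta                           # policy actions for demo states
    grad = 2 * demo_x.T @ (pred - demo_y) / len(demo_x)
    return theta - lr * grad                        # task-adapted parameters

rng = np.random.default_rng(0)
theta_meta = np.zeros(4)                            # meta-learned initialization
x = rng.normal(size=(8, 4))                         # one human-guided demonstration
w_true = np.array([1.0, -1.0, 0.5, 0.0])            # hidden task mapping
y = x @ w_true                                      # demonstrated actions
theta_task = adapt_to_demo(theta_meta, x, y)
```

The outer (meta) loop, not shown, would optimize `theta_meta` so that this one inner step yields good post-adaptation behavior across many tasks.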
-
Publication number: US20240017405A1
Publication date: 2024-01-18
Application number: US18222858
Filing date: 2023-07-17
Applicant: GOOGLE LLC
Inventor: Alexander Toshev , Fereshteh Sadeghi , Sergey Levine
CPC classification number: B25J9/163 , B25J9/1697 , G05B13/027 , G06N3/084 , G06N3/044 , G06N3/045 , G05B2219/33056 , G05B2219/39391 , G05B2219/40499 , G05B2219/42152
Abstract: Training and/or using a recurrent neural network model for visual servoing of an end effector of a robot. In visual servoing, the model can be utilized to generate, at each of a plurality of time steps, an action prediction that represents a prediction of how the end effector should be moved to cause the end effector to move toward a target object. The model can be viewpoint invariant in that it can be utilized across a variety of robots having vision components at a variety of viewpoints and/or can be utilized for a single robot even when a viewpoint, of a vision component of the robot, is drastically altered. Moreover, the model can be trained based on a large quantity of simulated data that is based on simulator(s) performing simulated episode(s) in view of the model. One or more portions of the model can be further trained based on a relatively smaller quantity of real training data.
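At inference time, a recurrent servoing model of this kind is unrolled one step per control cycle: the hidden state carries history across steps, and the readout at each step is the predicted end-effector motion. A minimal sketch with a tanh cell operating on feature vectors rather than raw camera images (all shapes and weights are illustrative assumptions):

```python
import numpy as np

def rnn_servo_step(obs, h, params):
    """One time step of a recurrent visual-servoing controller: fold the
    current observation into the hidden state, then read out a predicted
    end-effector motion toward the target object."""
    Wx, Wh, Wo = params
    h_next = np.tanh(obs @ Wx + h @ Wh)      # recurrent state update
    action = h_next @ Wo                     # predicted end-effector motion
    return action, h_next

rng = np.random.default_rng(0)
obs_dim, hid_dim, act_dim, T = 6, 8, 3, 5
params = (rng.normal(scale=0.1, size=(obs_dim, hid_dim)),
          rng.normal(scale=0.1, size=(hid_dim, hid_dim)),
          rng.normal(scale=0.1, size=(hid_dim, act_dim)))
h = np.zeros(hid_dim)
actions = []
for t in range(T):                           # closed-loop servoing episode
    obs = rng.normal(size=obs_dim)           # stand-in for an image embedding
    a, h = rnn_servo_step(obs, h, params)
    actions.append(a)
```

Because the action depends on the hidden state rather than on a fixed camera geometry, the same unrolling works across viewpoints, which is the viewpoint-invariance property the abstract describes.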
-
Publication number: US11717959B2
Publication date: 2023-08-08
Application number: US16622309
Filing date: 2018-06-28
Applicant: Google LLC
Inventor: Eric Jang , Sudheendra Vijayanarasimhan , Peter Pastor Sampedro , Julian Ibarz , Sergey Levine
CPC classification number: B25J9/163 , G06N3/008 , G06N3/045 , G06N3/08 , G05B2219/39536
Abstract: Deep machine learning methods and apparatus related to semantic robotic grasping are provided. Some implementations relate to training a grasp neural network, a semantic neural network, and a joint neural network of a semantic grasping model. In some of those implementations, the joint network is a deep neural network and can be trained based on both: grasp losses generated based on grasp predictions generated over the grasp neural network, and semantic losses generated based on semantic predictions generated over the semantic neural network. Some implementations are directed to utilization of the trained semantic grasping model to servo, or control, a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).
-
Publication number: US20220388159A1
Publication date: 2022-12-08
Application number: US17878186
Filing date: 2022-08-01
Applicant: Google LLC
Inventor: Sergey Levine , Ethan Holly , Shixiang Gu , Timothy Lillicrap
Abstract: Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.
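The collection/training split described above can be sketched as a loop in which each robot fetches the latest policy parameters, runs an exploration episode, and contributes experience to a shared buffer from which the trainer draws batches. The toy linear policy, reward, and update rule below are illustrative stand-ins; the patent's policy is a neural network and the robots run concurrently rather than in sequence.

```python
import numpy as np

def run_episode(theta, rng, steps=8):
    """One robot's exploration episode guided by the current policy
    parameters theta (a toy linear state-to-action map); returns
    (state, action, reward) experience tuples."""
    data = []
    for _ in range(steps):
        s = rng.normal(size=3)
        a = s @ theta + 0.1 * rng.normal(size=2)     # policy action + exploration
        data.append((s, a, -float(np.sum(a ** 2))))  # toy reward favors a = 0
    return data

def train(num_robots=4, rounds=5, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    theta0 = rng.normal(scale=0.5, size=(3, 2))      # initial policy parameters
    theta, buffer = theta0.copy(), []
    for _ in range(rounds):
        for _ in range(num_robots):                  # robots collect with latest theta
            buffer.extend(run_episode(theta, rng))
        batch = buffer[-32:]                         # trainer draws a recent batch
        grad = sum(np.outer(s, s @ theta) for s, a, r in batch) / len(batch)
        theta = theta - lr * 2 * grad                # descend E[|s @ theta|^2]
    return theta0, theta

theta_before, theta_after = train()
```

Note that the batch may contain experience generated under older parameter values; training on it anyway is what makes the scheme off-policy.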
-
Publication number: US20220143819A1
Publication date: 2022-05-12
Application number: US17094521
Filing date: 2020-11-10
Applicant: Google LLC
Inventor: Jie Tan , Sehoon Ha , Peng Xu , Sergey Levine , Zhenyu Tan
Abstract: Techniques are disclosed that enable training a plurality of policy networks, each policy network corresponding to a disparate robotic training task, using a mobile robot in a real world workspace. Various implementations include selecting a training task based on comparing a pose of the mobile robot to at least one parameter of a real world training workspace. For example, the training task can be selected based on the position of a landmark, within the workspace, relative to the pose. For instance, the training task can be selected such that the selected training task moves the mobile robot towards the landmark.
-
Publication number: US20220063089A1
Publication date: 2022-03-03
Application number: US17524185
Filing date: 2021-11-11
Applicant: GOOGLE LLC
Inventor: Sergey Levine , Chelsea Finn , Ian Goodfellow
Abstract: Some implementations of this specification are directed generally to deep machine learning methods and apparatus related to predicting motion(s) (if any) that will occur to object(s) in an environment of a robot in response to particular movement of the robot in the environment. Some implementations are directed to training a deep neural network model to predict at least one transformation (if any), of an image of a robot's environment, that will occur as a result of implementing at least a portion of a particular movement of the robot in the environment. The trained deep neural network model may predict the transformation based on input that includes the image and a group of robot movement parameters that define the portion of the particular movement.
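The core input/output contract here — (current image, robot movement parameters) in, predicted image transformation out — can be illustrated with a drastically simplified model in which the movement parameters select one of a few candidate per-pixel shifts. This is only a sketch; the patent's deep network predicts richer transformations, and the selector weights `W` below are a hypothetical stand-in for learned parameters.

```python
import numpy as np

def predict_next_image(image, motion_params, W):
    """Predict how an image will transform as a result of a commanded robot
    motion: the motion parameters score a small set of candidate pixel
    shifts, and the winning shift is applied to the image."""
    shifts = [(0, 0), (0, 1), (1, 0), (0, -1), (-1, 0)]  # candidate shifts
    logits = W @ motion_params               # motion parameters select a shift
    k = int(np.argmax(logits))
    dy, dx = shifts[k]
    return np.roll(image, shift=(dy, dx), axis=(0, 1)), k

rng = np.random.default_rng(0)
image = rng.random((8, 8))                   # current camera image
motion = np.array([0.0, 1.0])                # commanded movement parameters
W = rng.normal(size=(5, 2))                  # toy "learned" selector weights
pred, k = predict_next_image(image, motion, W)
```

Because the prediction is a transformation of the input image rather than pixels generated from scratch, object appearance is preserved and only motion must be learned, which is the advantage the abstract alludes to.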
-
Publication number: US20210162590A1
Publication date: 2021-06-03
Application number: US17172666
Filing date: 2021-02-10
Applicant: Google LLC
Inventor: Sergey Levine , Peter Pastor Sampedro , Alex Krizhevsky
Abstract: Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a deep neural network to predict a measure that candidate motion data for an end effector of a robot will result in a successful grasp of one or more objects by the end effector. Some implementations are directed to utilization of the trained deep neural network to servo a grasping end effector of a robot to achieve a successful grasp of an object by the grasping end effector. For example, the trained deep neural network may be utilized in the iterative updating of motion control commands for one or more actuators of a robot that control the pose of a grasping end effector of the robot, and to determine when to generate grasping control commands to effectuate an attempted grasp by the grasping end effector.
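The servo loop the abstract describes — iteratively updating motion commands and deciding when to trigger the grasp — can be sketched as: sample candidate motions, score each with the trained success predictor, move along the best candidate, and close the gripper once staying put scores nearly as well as any motion. The toy predictor and the 5% trigger margin below are illustrative assumptions, not the patent's exact rule.

```python
import numpy as np

def servo_step(grasp_net, image_feat, pose, rng, n_candidates=16):
    """One servo iteration: sample candidate end-effector motions, score
    each with the grasp-success predictor, and either command the best
    motion or trigger the grasp when no motion clearly beats staying put."""
    cands = rng.normal(scale=0.05, size=(n_candidates, 3))
    scores = np.array([grasp_net(image_feat, pose + c) for c in cands])
    stay = grasp_net(image_feat, pose)       # predicted success of not moving
    best = int(np.argmax(scores))
    if scores[best] <= stay * 1.05:          # no candidate clearly better
        return "grasp", pose
    return "move", pose + cands[best]

# Toy predictor: success is highest when the gripper pose reaches the origin.
net = lambda feat, p: float(np.exp(-np.sum(p ** 2)))
rng = np.random.default_rng(0)
pose = np.array([0.3, -0.2, 0.1])
for _ in range(50):                          # closed-loop servoing
    cmd, pose = servo_step(net, None, pose, rng)
    if cmd == "grasp":
        break
```

Re-scoring at every iteration is what makes the control closed-loop: if the object moves, the predictor's scores shift and the commanded motions follow.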