-
Publication No.: US10946515B2
Publication Date: 2021-03-16
Application No.: US16234272
Application Date: 2018-12-27
Applicant: Google LLC
Inventor: Sergey Levine, Peter Pastor Sampedro, Alex Krizhevsky
Abstract: Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a deep neural network to predict a measure that candidate motion data for an end effector of a robot will result in a successful grasp of one or more objects by the end effector. Some implementations are directed to utilization of the trained deep neural network to servo a grasping end effector of a robot to achieve a successful grasp of an object by the grasping end effector. For example, the trained deep neural network may be utilized in the iterative updating of motion control commands for one or more actuators of a robot that control the pose of a grasping end effector of the robot, and to determine when to generate grasping control commands to effectuate an attempted grasp by the grasping end effector.
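The servoing loop the abstract describes can be sketched as: sample candidate end-effector motions, score each with the trained grasp-success network, and execute the best-scoring motion (or issue the grasp command when no motion improves the score). The sketch below is illustrative only; `grasp_success_score` is a hypothetical stand-in for the patent's deep neural network, which in the actual method consumes camera images as well as candidate motion data.

```python
import numpy as np

# Hypothetical stand-in for the trained grasp-success network in the
# abstract: maps a candidate end-effector motion (here, a 3-D translation)
# to a probability-like score in (0, 1]. The real network would also take
# the current camera image as input.
def grasp_success_score(motion, target=np.array([0.1, -0.05, 0.2])):
    # Score rises as the candidate motion approaches an (assumed) good grasp pose.
    return float(np.exp(-np.linalg.norm(motion - target)))

def servo_step(rng, n_candidates=64):
    """One iteration of the servoing loop: sample candidate motions,
    score each with the (stand-in) network, and return the best one."""
    candidates = rng.uniform(-0.3, 0.3, size=(n_candidates, 3))
    scores = [grasp_success_score(c) for c in candidates]
    best = candidates[int(np.argmax(scores))]
    # A controller would repeat this step, and when the best candidate no
    # longer improves on "do nothing", generate the grasp (close-gripper)
    # command instead, as the abstract's final sentence describes.
    return best, max(scores)
```

Iterating `servo_step` while the end effector moves corresponds to the "iterative updating of motion control commands" in the abstract.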
-
Publication No.: US20200215686A1
Publication Date: 2020-07-09
Application No.: US16823947
Application Date: 2020-03-19
Applicant: Google LLC
Inventor: Sudheendra Vijayanarasimhan, Eric Jang, Peter Pastor Sampedro, Sergey Levine
Abstract: Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a semantic grasping model to predict a measure that indicates whether motion data for an end effector of a robot will result in a successful grasp of an object; and to predict an additional measure that indicates whether the object has desired semantic feature(s). Some implementations are directed to utilization of the trained semantic grasping model to servo a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).
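The two predictions the abstract mentions suggest a two-headed model: one head scores grasp success, the other predicts whether the grasped object would have the desired semantic feature(s). A minimal sketch, with random stand-in weights and entirely hypothetical names (the patent does not specify this architecture or combination rule):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Stand-in parameters for a hypothetical two-headed semantic grasping
# model: a grasp-success head and a semantic-class head sharing one
# feature vector. Real models would compute features from images.
rng = np.random.default_rng(0)
W_grasp = rng.normal(size=8)        # grasp-success head weights
W_sem = rng.normal(size=(5, 8))     # semantic head weights (5 classes)

def semantic_grasp_score(features, desired_class):
    p_grasp = 1.0 / (1.0 + np.exp(-(W_grasp @ features)))  # grasp head
    p_class = softmax(W_sem @ features)[desired_class]     # semantic head
    # Servo toward motions predicted to both succeed and pick up an
    # object with the desired semantic feature.
    return p_grasp * p_class
```

Combining the two heads multiplicatively is one plausible way to servo toward grasps of the desired object; the abstract only states that both measures are predicted.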
-
Publication No.: US20190283245A1
Publication Date: 2019-09-19
Application No.: US16234272
Application Date: 2018-12-27
Applicant: Google LLC
Inventor: Sergey Levine, Peter Pastor Sampedro, Alex Krizhevsky
Abstract: Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a deep neural network to predict a measure that candidate motion data for an end effector of a robot will result in a successful grasp of one or more objects by the end effector. Some implementations are directed to utilization of the trained deep neural network to servo a grasping end effector of a robot to achieve a successful grasp of an object by the grasping end effector. For example, the trained deep neural network may be utilized in the iterative updating of motion control commands for one or more actuators of a robot that control the pose of a grasping end effector of the robot, and to determine when to generate grasping control commands to effectuate an attempted grasp by the grasping end effector.
-
Publication No.: US20190232488A1
Publication Date: 2019-08-01
Application No.: US16333482
Application Date: 2017-09-14
Applicant: Google LLC
Inventor: Sergey Levine, Ethan Holly, Shixiang Gu, Timothy Lillicrap
IPC: B25J9/16, G05B13/02, G05B19/042
CPC classification number: B25J9/161, B25J9/163, B25J9/1664, G05B13/027, G05B19/042, G05B2219/32335, G05B2219/33033, G05B2219/33034, G05B2219/39001, G05B2219/39298, G05B2219/40499, G06N3/008, G06N3/0454, G06N3/08
Abstract: Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.
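The data flow in the abstract — multiple robots running episodes with the current policy parameters, pooling their experience, and a trainer updating the parameters from batches of that pool — can be sketched as below. Everything here is a simplified, illustrative stand-in: the workers are simulated, and the "policy" is a single scalar rather than a neural network.

```python
import random

# Shared structures: a pooled experience buffer and the current policy
# parameters (a scalar stand-in for the policy network's weights).
replay_buffer = []
policy_params = {"version": 0, "weights": 0.0}

def run_episode(worker_id, params, steps=5):
    # Each episode is guided by the parameters fetched at its start,
    # matching the abstract's "provided (or retrieved)" pre-episode sync.
    return [{"worker": worker_id, "version": params["version"],
             "reward": random.random()} for _ in range(steps)]

def train_step(batch):
    # Stand-in update: nudge "weights" toward the batch's mean reward.
    mean_r = sum(t["reward"] for t in batch) / len(batch)
    policy_params["weights"] += 0.1 * (mean_r - policy_params["weights"])
    policy_params["version"] += 1

random.seed(0)
for episode_round in range(4):
    for worker_id in range(3):            # robots operating "simultaneously"
        params = dict(policy_params)      # fetch current params pre-episode
        replay_buffer.extend(run_episode(worker_id, params))
    batch = random.sample(replay_buffer, min(16, len(replay_buffer)))
    train_step(batch)                     # iterative policy-parameter update
```

In a real deployment the inner loop would run concurrently on separate robots, with the parameter fetch and buffer writes going over a network; the sequential loop above only shows the ordering of fetch, collect, and update.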
-