-
Publication No.: US20200304545A1
Publication Date: 2020-09-24
Application No.: US16827596
Filing Date: 2020-03-23
Applicant: Google LLC
Inventor: Kanury Kanishka Rao, Konstantinos Bousmalis, Christopher K. Harris, Alexander Irpan, Sergey Vladimir Levine, Julian Ibarz
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for off-policy evaluation of a control policy. One of the methods includes obtaining policy data specifying a control policy for controlling a source agent interacting with a source environment to perform a particular task; obtaining a validation data set generated from interactions of a target agent in a target environment; determining a performance estimate that represents an estimate of a performance of the control policy in controlling the target agent to perform the particular task in the target environment; and determining, based on the performance estimate, whether to deploy the control policy for controlling the target agent to perform the particular task in the target environment.
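The deploy-or-not decision described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the performance estimate is computed, so the estimator below (agreement between a thresholded policy score and a logged success label), the transition layout, and the names `performance_estimate`, `should_deploy`, `cutoff`, and `min_estimate` are all assumptions made for illustration only.

```python
from typing import Callable, Sequence, Tuple

# Assumed layout of one logged validation transition:
# (observation, action, whether the episode it came from succeeded).
Transition = Tuple[Sequence[float], int, bool]


def performance_estimate(
    score: Callable[[Sequence[float], int], float],
    validation_set: Sequence[Transition],
    cutoff: float = 0.5,
) -> float:
    """Hypothetical estimator: the fraction of logged transitions whose
    thresholded score agrees with the logged success label."""
    if not validation_set:
        raise ValueError("validation set is empty")
    hits = sum(
        (score(obs, act) >= cutoff) == succeeded
        for obs, act, succeeded in validation_set
    )
    return hits / len(validation_set)


def should_deploy(estimate: float, min_estimate: float = 0.8) -> bool:
    """Deploy only if the estimate clears an assumed minimum."""
    return estimate >= min_estimate


# Toy data: the "policy score" is simply the first observation feature.
toy_score = lambda obs, act: obs[0]
toy_validation = [
    ([0.9], 0, True),
    ([0.2], 1, False),
    ([0.7], 0, False),
    ([0.1], 1, False),
]
toy_estimate = performance_estimate(toy_score, toy_validation)
toy_decision = should_deploy(toy_estimate)
```

The point of the sketch is only the two-stage shape: an estimate is computed from data gathered in the target environment, and the deployment decision is a function of that estimate.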
-
Publication No.: US20200279134A1
Publication Date: 2020-09-03
Application No.: US16649599
Filing Date: 2018-09-20
Applicant: Google LLC
Inventor: Konstantinos Bousmalis, Alexander Irpan, Paul Wohlhart, Yunfei Bai, Mrinal Kalakrishnan, Julian Ibarz, Sergey Vladimir Levine, Kurt Konolige, Vincent O. Vanhoucke, Matthew Laurance Kelcey
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network that is used to control a robotic agent interacting with a real-world environment.
-
Publication No.: US20190385022A1
Publication Date: 2019-12-19
Application No.: US16443765
Filing Date: 2019-06-17
Applicant: Google LLC
Inventor: Eric Victor Jang, Sergey Vladimir Levine, Coline Manon Devin
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an object representation neural network. One of the methods includes obtaining training sets of images, each training set comprising: (i) a before image of a before scene of the environment, (ii) an after image of an after scene of the environment after the robot has removed a particular object, and (iii) an object image of the particular object, and training the object representation neural network on the batch of training data, comprising determining an update to the object representation parameters that encourages the vector embedding of the particular object in each training set to be closer to a difference between (i) the vector embedding of the after scene in the training set and (ii) the vector embedding of the before scene in the training set.
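The training objective in this abstract can be sketched as a distance between the object embedding and the change in the scene embedding. The following is a minimal illustration, not the claimed method: the squared Euclidean distance, the sign convention (before minus after), the learning rate, and the direct update of the object vector in place of a network parameter update are all assumptions.

```python
def embedding_arithmetic_loss(before_vec, after_vec, object_vec):
    """Squared distance between the object embedding and the scene
    difference. The sign (before minus after) is an assumption; the
    abstract only says "a difference between" the two scene embeddings."""
    scene_delta = [b - a for b, a in zip(before_vec, after_vec)]
    return sum((o - d) ** 2 for o, d in zip(object_vec, scene_delta))


def gradient_step(before_vec, after_vec, object_vec, lr=0.25):
    """One gradient-descent step on the object embedding only. A real
    system would update network parameters; this stands in for that."""
    scene_delta = [b - a for b, a in zip(before_vec, after_vec)]
    return [o - lr * 2.0 * (o - d) for o, d in zip(object_vec, scene_delta)]


# Toy embeddings (hypothetical values).
before = [1.0, 2.0]
after = [0.5, 0.5]
obj = [0.0, 0.0]

loss_start = embedding_arithmetic_loss(before, after, obj)
obj_updated = gradient_step(before, after, obj)
loss_end = embedding_arithmetic_loss(before, after, obj_updated)
```

In this toy case a single step moves the object embedding toward the scene difference, so `loss_end` is smaller than `loss_start`.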
-
Publication No.: US20190251437A1
Publication Date: 2019-08-15
Application No.: US16332961
Filing Date: 2017-09-15
Applicant: Google LLC
Inventor: Chelsea Breanna Finn, Sergey Vladimir Levine
CPC classification number: G06N3/08, G06N3/008, G06N3/04, G06N3/0445, G06N3/0454
Abstract: A method includes: receiving data identifying, for each of one or more objects, a respective target location to which a robotic agent interacting with a real-world environment should move the object; causing the robotic agent to move the one or more objects to the one or more target locations by repeatedly performing the following: receiving a current image of a current state of the real-world environment; determining, from the current image, a next sequence of actions to be performed by the robotic agent using a next image prediction neural network that predicts future images based on a current action and an action to be performed by the robotic agent; and directing the robotic agent to perform the next sequence of actions.
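The repeated observe, plan, act cycle in this abstract can be sketched with a toy stand-in for the prediction network. Everything concrete here is hypothetical: the "image" is reduced to a single integer object position, `predict_next` is a trivial additive model rather than a neural network, and exhaustive search over short action sequences is an assumed planner, since the abstract does not say how the next sequence of actions is chosen.

```python
import itertools


def predict_next(state, action):
    """Hypothetical stand-in for the next image prediction network:
    the state is an object position and an action shifts it."""
    return state + action


def rollout(state, action_sequence):
    """Predicted state after applying every action in the sequence."""
    for action in action_sequence:
        state = predict_next(state, action)
    return state


def plan(state, target, actions=(-1, 0, 1), horizon=3):
    """Pick the action sequence whose predicted outcome is closest to
    the target location (exhaustive search, an assumed planner)."""
    candidates = itertools.product(actions, repeat=horizon)
    return min(candidates, key=lambda seq: abs(rollout(state, seq) - target))


def control_loop(state, target, max_rounds=10):
    """Repeatedly observe, plan, and act. The toy environment reuses
    predict_next as its true dynamics, which a real system would not."""
    for _ in range(max_rounds):
        if state == target:
            break
        for action in plan(state, target):
            state = predict_next(state, action)
    return state


final_position = control_loop(state=0, target=5)
```

Replanning from the newly observed state after each executed sequence is what lets the loop reach a target that lies beyond a single planning horizon.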
-