Off-policy control policy evaluation

    Publication No.: US11477243B2

    Publication date: 2022-10-18

    Application No.: US16827596

    Application date: 2020-03-23

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for off-policy evaluation of a control policy. One of the methods includes obtaining policy data specifying a control policy for controlling a source agent interacting with a source environment to perform a particular task; obtaining a validation data set generated from interactions of a target agent in a target environment; determining a performance estimate that represents an estimate of a performance of the control policy in controlling the target agent to perform the particular task in the target environment; and determining, based on the performance estimate, whether to deploy the control policy for controlling the target agent to perform the particular task in the target environment.
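The evaluate-then-decide flow in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say how the performance estimate is computed, so the ranking-style score below (mean Q-value on transitions from successful episodes minus mean Q-value over all transitions) is an assumption, as are the names `Transition`, `performance_estimate`, `should_deploy`, and the deployment threshold.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Transition:
    """One logged step from the target agent in the target environment (hypothetical schema)."""
    observation: Tuple[float, ...]
    action: float
    from_successful_episode: bool


def performance_estimate(
    q_fn: Callable[[Tuple[float, ...], float], float],
    validation_data: Sequence[Transition],
) -> float:
    """Assumed estimator: how much higher the policy's Q-function scores
    transitions from successful episodes than it scores transitions overall."""
    all_q = [q_fn(t.observation, t.action) for t in validation_data]
    pos_q = [
        q_fn(t.observation, t.action)
        for t in validation_data
        if t.from_successful_episode
    ]
    if not all_q or not pos_q:
        raise ValueError("need at least one transition and one successful transition")
    return sum(pos_q) / len(pos_q) - sum(all_q) / len(all_q)


def should_deploy(estimate: float, threshold: float) -> bool:
    """Deployment decision based only on the estimate (the threshold is an assumption)."""
    return estimate >= threshold


# Toy usage: a stand-in Q-function and two logged transitions.
toy_q = lambda obs, act: obs[0] + act
toy_data = [
    Transition((1.0,), 1.0, True),
    Transition((0.0,), 0.0, False),
]
toy_estimate = performance_estimate(toy_q, toy_data)
toy_decision = should_deploy(toy_estimate, threshold=0.5)
```

The point of the sketch is that no rollout in the target environment is needed: the estimate is computed purely from logged validation data.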

    MACHINE LEARNING METHODS AND APPARATUS FOR SEMANTIC ROBOTIC GRASPING

    Publication No.: US20200338722A1

    Publication date: 2020-10-29

    Application No.: US16622309

    Application date: 2018-06-28

    Applicant: Google LLC

    Abstract: Deep machine learning methods and apparatus related to semantic robotic grasping are provided. Some implementations relate to training a grasp neural network, a semantic neural network, and a joint neural network of a semantic grasping model. In some of those implementations, the joint network is a deep neural network and can be trained based on both: grasp losses generated based on grasp predictions generated over the grasp neural network, and semantic losses generated based on semantic predictions generated over the semantic neural network. Some implementations are directed to utilization of the trained semantic grasping model to servo, or control, a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).
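To illustrate how a shared joint network could receive a training signal from both heads, the sketch below combines a grasp loss and a semantic loss into one scalar. The abstract does not specify the loss functions or how they are combined, so the binary and categorical cross-entropy terms, the weighted sum, and the names `joint_loss` and `semantic_weight` are all assumptions.

```python
import math
from typing import Sequence

_EPS = 1e-12  # guards against log(0)


def grasp_loss(predicted_success_prob: float, grasp_succeeded: int) -> float:
    """Assumed grasp loss: binary cross-entropy on the grasp-success prediction."""
    p = min(max(predicted_success_prob, _EPS), 1.0 - _EPS)
    return -(grasp_succeeded * math.log(p) + (1 - grasp_succeeded) * math.log(1.0 - p))


def semantic_loss(class_probs: Sequence[float], true_class: int) -> float:
    """Assumed semantic loss: cross-entropy on the predicted object class."""
    return -math.log(max(class_probs[true_class], _EPS))


def joint_loss(
    predicted_success_prob: float,
    grasp_succeeded: int,
    class_probs: Sequence[float],
    true_class: int,
    semantic_weight: float = 1.0,
) -> float:
    """Single scalar that would be backpropagated into the shared joint layers."""
    return grasp_loss(predicted_success_prob, grasp_succeeded) + semantic_weight * semantic_loss(
        class_probs, true_class
    )


# Toy usage: an uncertain grasp prediction and an uncertain two-class prediction.
example_loss = joint_loss(0.5, 1, [0.5, 0.5], 0)
```

In a real training setup the two terms would come from separate network heads and be differentiated by a deep learning framework; plain floats are used here only to keep the combination visible.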

    DEEP REINFORCEMENT LEARNING FOR ROBOTIC MANIPULATION

    Publication No.: US20210237266A1

    Publication date: 2021-08-05

    Application No.: US17052679

    Application date: 2019-06-14

    Applicant: Google LLC

    Abstract: Using large-scale reinforcement learning to train a policy model that can be utilized by a robot in performing a robotic task in which the robot interacts with one or more environmental objects. In various implementations, off-policy deep reinforcement learning is used to train the policy model, and the off-policy deep reinforcement learning is based on self-supervised data collection. The policy model can be a neural network model. Implementations of the reinforcement learning utilized in training the neural network model utilize a continuous-action variant of Q-learning. Through techniques disclosed herein, implementations can learn policies that generalize effectively to previously unseen objects, previously unseen environments, etc.
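With continuous actions, the maximization inside the Q-learning target can no longer be done by enumerating actions. The sketch below approximates it by uniform random sampling. This is one simple possibility chosen for illustration, not necessarily what the application describes: the abstract names only "a continuous-action variant of Q-learning", so the sampling scheme, the one-dimensional action range, and the names `approx_max_q` and `td_target` are assumptions.

```python
import random
from typing import Callable, Optional, Tuple

QFunction = Callable[[float, float], float]  # (state, action) -> value


def approx_max_q(
    q_fn: QFunction,
    state: float,
    num_samples: int = 64,
    low: float = -1.0,
    high: float = 1.0,
    rng: Optional[random.Random] = None,
) -> Tuple[float, float]:
    """Approximate argmax_a Q(state, a) by scoring uniformly sampled actions."""
    rng = rng or random.Random(0)
    best_action, best_q = low, float("-inf")
    for _ in range(num_samples):
        action = rng.uniform(low, high)
        q = q_fn(state, action)
        if q > best_q:
            best_action, best_q = action, q
    return best_action, best_q


def td_target(
    q_fn: QFunction,
    reward: float,
    next_state: float,
    done: bool,
    gamma: float = 0.9,
    num_samples: int = 64,
) -> float:
    """Bellman target r + gamma * max_a' Q(s', a'), with the max approximated by sampling."""
    if done:
        return reward
    _, max_q = approx_max_q(q_fn, next_state, num_samples=num_samples)
    return reward + gamma * max_q


# Toy usage: a Q-function whose best action is 0.5 in every state.
toy_q = lambda state, action: -((action - 0.5) ** 2)
best_action, best_value = approx_max_q(toy_q, state=0.0, num_samples=500)
target = td_target(toy_q, reward=1.0, next_state=0.0, done=False, num_samples=500)
```

Because the target depends only on stored transitions and the current Q-function, it can be computed from data gathered by any earlier policy, which is what makes the approach off-policy.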

    OFF-POLICY CONTROL POLICY EVALUATION
    Patent application

    Publication No.: US20200304545A1

    Publication date: 2020-09-24

    Application No.: US16827596

    Application date: 2020-03-23

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for off-policy evaluation of a control policy. One of the methods includes obtaining policy data specifying a control policy for controlling a source agent interacting with a source environment to perform a particular task; obtaining a validation data set generated from interactions of a target agent in a target environment; determining a performance estimate that represents an estimate of a performance of the control policy in controlling the target agent to perform the particular task in the target environment; and determining, based on the performance estimate, whether to deploy the control policy for controlling the target agent to perform the particular task in the target environment.
