-
Publication No.: US11188821B1
Publication Date: 2021-11-30
Application No.: US15705601
Filing Date: 2017-09-15
Applicant: X Development LLC
Inventor: Mrinal Kalakrishnan, Ali Hamid Yahya Valdovinos, Adrian Ling Hin Li, Yevgen Chebotar, Sergey Vladimir Levine
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a global policy neural network. One of the methods includes, for each of multiple local workers: initializing an instance of the robotic task, generating a trajectory of state-action pairs by selecting actions to be performed by the robotic agent while performing the instance of the robotic task, optimizing a local policy controller on the trajectory, generating an optimized trajectory using the optimized local policy controller, and storing the optimized trajectory in a replay memory associated with the local worker. The method also includes, for each of multiple global workers: sampling an optimized trajectory from one of one or more replay memories associated with the global worker, and training the replica of the global policy neural network maintained by the global worker on the sampled optimized trajectory to determine delta values for the parameters of the global policy neural network.
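
The data flow this abstract describes (local workers that optimize per-instance controllers and fill replay memories, and global workers that sample those memories to compute parameter deltas for a shared policy) can be illustrated with a toy sketch. The Python below is a minimal runnable illustration on a hypothetical 1-D task (drive the state to zero) with a linear policy; the task, the brute-force controller search, and every name in it are illustrative assumptions, not the patent's implementation.

import random
from collections import deque

import numpy as np

HORIZON = 10


def rollout(gain, start_state):
    """Roll out the linear policy a = gain * s, returning (state, action) pairs."""
    s, pairs = start_state, []
    for _ in range(HORIZON):
        a = gain * s
        pairs.append((s, a))
        s = s + a                                     # toy dynamics
    return pairs


def optimize_local_controller(trajectory):
    """Pick the local gain minimizing sum(s^2) when re-rolled from the trajectory's
    start state (a stand-in for real trajectory optimization)."""
    s0 = trajectory[0][0]
    gains = np.linspace(-1.5, 0.5, 41)
    costs = [sum(s * s for s, _ in rollout(g, s0)) for g in gains]
    return float(gains[int(np.argmin(costs))])


def global_worker_step(replica_gain, memory, lr=0.05):
    """Sample an optimized trajectory and return a delta for the shared parameter."""
    trajectory = random.choice(memory)
    # Supervised step: push the policy replica toward the local controller's actions.
    grad = sum(2.0 * (replica_gain * s - a) * s for s, a in trajectory)
    return -lr * grad


# Main loop: local workers fill their replay memories, global workers consume them.
random.seed(0)
global_gain = 0.0
memories = [deque(maxlen=20) for _ in range(2)]       # one replay memory per local worker
for _ in range(100):
    for memory in memories:                           # local workers
        start = random.uniform(-1.0, 1.0)             # initialize a task instance
        on_policy = rollout(global_gain, start)       # trajectory under the global policy
        local_gain = optimize_local_controller(on_policy)
        memory.append(rollout(local_gain, start))     # store the optimized trajectory
    for memory in memories:                           # global workers
        global_gain += global_worker_step(global_gain, memory)

print("learned global gain:", round(global_gain, 3))  # approaches -1.0

The brute-force gain search merely stands in for whatever trajectory optimizer a real system would use; the point of the sketch is the data flow between the two kinds of workers and the replay memories that connect them.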
-
Publication No.: US10960539B1
Publication Date: 2021-03-30
Application No.: US15705655
Filing Date: 2017-09-15
Applicant: X Development LLC
Inventor: Mrinal Kalakrishnan, Ali Hamid Yahya Valdovinos, Adrian Ling Hin Li, Yevgen Chebotar, Sergey Vladimir Levine
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a global policy neural network. One of the methods includes initializing a plurality of instances of the robotic task. For each instance of the robotic task, the method includes generating a trajectory of state-action pairs by selecting actions to be performed by the robotic agent while performing the instance of the robotic task in accordance with current values of the parameters of the global policy neural network, and optimizing a local policy controller that is specific to the instance on the trajectory of state-action pairs for the instance. The method further includes generating training data for the global policy neural network using the local policy controllers, and training the global policy neural network on the training data to adjust the current values of the parameters of the global policy neural network.
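
In contrast with the asynchronous replay-memory scheme above, this abstract describes a synchronous outer loop: optimize one controller per task instance, pool the controllers' state-action data, then fit the global policy to that pool. The sketch below is a minimal runnable illustration of that loop on the same kind of hypothetical 1-D task; the toy dynamics, the least-squares fit, and every name are illustrative assumptions, not the patent's implementation.

import numpy as np

HORIZON = 10
rng = np.random.default_rng(0)


def rollout(gain, start_state):
    """Trajectory of (state, action) pairs for the linear policy a = gain * s."""
    s, pairs = start_state, []
    for _ in range(HORIZON):
        a = gain * s
        pairs.append((s, a))
        s = s + a                                   # toy dynamics
    return pairs


def optimize_local_controller(trajectory):
    """Instance-specific controller: the gain minimizing sum(s^2) when re-rolled
    from the trajectory's start state (a stand-in for trajectory optimization)."""
    s0 = trajectory[0][0]
    gains = np.linspace(-1.5, 0.5, 41)
    costs = [sum(s * s for s, _ in rollout(g, s0)) for g in gains]
    return float(gains[int(np.argmin(costs))])


global_gain = 0.0
for _ in range(5):
    starts = rng.uniform(-1.0, 1.0, size=4)         # initialize several task instances
    states, actions = [], []
    for s0 in starts:
        on_policy = rollout(global_gain, float(s0)) # trajectory under the global policy
        local_gain = optimize_local_controller(on_policy)
        for s, a in rollout(local_gain, float(s0)): # training data from the controller
            states.append(s)
            actions.append(a)
    # Supervised update: least-squares fit of the global policy to the controllers.
    states, actions = np.array(states), np.array(actions)
    global_gain = float(states @ actions / (states @ states + 1e-8))

print("learned global gain:", round(global_gain, 3))  # close to -1.0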
-
Publication No.: US20220410380A1
Publication Date: 2022-12-29
Application No.: US17843288
Filing Date: 2022-06-17
Applicant: X Development LLC
Inventor: Yao Lu, Mengyuan Yan, Seyed Mohammad Khansari Zadeh, Alexander Herzog, Eric Jang, Karol Hausman, Yevgen Chebotar, Sergey Levine, Alexander Irpan
IPC: B25J9/16
Abstract: Utilizing an initial set of offline positive-only robotic demonstration data for pre-training an actor network and a critic network for robotic control, followed by further training of the networks based on online robotic episodes that utilize the network(s). Implementations enable the actor network to be effectively pre-trained while mitigating the occurrence and/or extent of forgetting when the networks are further trained on episode data. Implementations additionally or alternatively enable the actor network to be trained to a given degree of effectiveness in fewer training steps. In various implementations, one or more adaptation techniques are utilized in performing the robotic episodes and/or in performing the robotic training. Each adaptation technique can, individually, result in one or more corresponding advantages and, when used in any combination, the corresponding advantages can accumulate. The adaptation techniques include Positive Sample Filtering, Adaptive Exploration, Using Max Q Values, and Using the Actor in CEM.
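
To make the pre-train-then-fine-tune flow concrete, the toy Python sketch below illustrates two of the named adaptation techniques, Positive Sample Filtering and Adaptive Exploration, on a hypothetical one-step "reach the target" task. The task, the linear actor, the noise schedule, and every name here are illustrative assumptions, not the patent's implementation, and the remaining techniques (Using Max Q Values, Using the Actor in CEM) are not shown.

import numpy as np

rng = np.random.default_rng(0)


def run_episode(actor_w, target, noise_scale):
    """One episode: the actor proposes an action, exploration noise is added."""
    action = actor_w * target + rng.normal(0.0, noise_scale)
    success = abs(action - target) < 0.1            # sparse, positive-only signal
    return action, success


def actor_update(actor_w, data, lr=0.5):
    """Behavior-cloning-style step on (state, action) pairs from successful episodes."""
    for target, action in data:
        actor_w -= lr * 2.0 * (actor_w * target - action) * target
    return actor_w


# 1) Pre-train on offline, positive-only demonstration data (perfect demos here).
demos = [(float(t), float(t)) for t in rng.uniform(-1.0, 1.0, size=50)]
actor_w = actor_update(0.0, demos)

# 2) Online episodes with Positive Sample Filtering and Adaptive Exploration.
noise_scale, outcomes = 0.5, []
buffer = list(demos)
for _ in range(500):
    target = rng.uniform(-1.0, 1.0)
    action, success = run_episode(actor_w, target, noise_scale)
    outcomes.append(success)
    if success:                                     # Positive Sample Filtering:
        buffer.append((target, action))             # only successes enter the buffer
    if len(outcomes) >= 50:                         # Adaptive Exploration: less noise
        success_rate = float(np.mean(outcomes[-50:]))
        noise_scale = 0.5 * (1.0 - success_rate) + 0.05
    actor_w = actor_update(actor_w, [buffer[rng.integers(len(buffer))]])

print("actor weight:", round(actor_w, 3), "noise scale:", round(noise_scale, 3))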
-