-
Publication No.: US20220297303A1
Publication Date: 2022-09-22
Application No.: US17203296
Application Date: 2021-03-16
Applicant: X Development LLC
Inventor: Seyed Mohammad Khansari Zadeh, Eric Jang, Daniel Lam, Daniel Kappler, Matthew Bennice, Brent Austin, Yunfei Bai, Sergey Levine, Alexander Irpan, Nicolas Sievers, Chelsea Finn
Abstract: Implementations described herein relate to training and refining robotic control policies using imitation learning techniques. A robotic control policy can be initially trained based on human demonstrations of various robotic tasks. Further, the robotic control policy can be refined based on human interventions while a robot is performing a robotic task. In some implementations, the robotic control policy may determine whether the robot will fail in performance of the robotic task, and prompt a human to intervene in performance of the robotic task. In additional or alternative implementations, a representation of the sequence of actions can be visually rendered for presentation to the human so that the human can proactively intervene in performance of the robotic task.
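The intervention flow described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the method claimed in the application: the names `run_episode`, `InterventionLog`, and `failure_threshold` are hypothetical, the policy is assumed to return an action together with an estimated failure probability, and observations and actions are reduced to single floats.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class InterventionLog:
    """Collects (observation, human_action) pairs for later policy refinement."""
    samples: List[Tuple[float, float]] = field(default_factory=list)


def run_episode(
    policy: Callable[[float], Tuple[float, float]],
    human: Callable[[float], float],
    observations: List[float],
    failure_threshold: float = 0.5,  # hypothetical cutoff, not from the source
) -> Tuple[List[float], InterventionLog]:
    """Run one episode, prompting the human when the policy predicts failure."""
    log = InterventionLog()
    executed: List[float] = []
    for obs in observations:
        # Assumed policy interface: returns (action, predicted failure probability).
        action, p_fail = policy(obs)
        if p_fail > failure_threshold:
            # The human overrides the action; the correction is stored so the
            # policy could later be refined on it.
            action = human(obs)
            log.samples.append((obs, action))
        executed.append(action)
    return executed, log


if __name__ == "__main__":
    toy_policy = lambda obs: (obs * 2.0, 0.9 if obs > 1.0 else 0.1)
    toy_human = lambda obs: -1.0
    actions, log = run_episode(toy_policy, toy_human, [0.5, 2.0])
    print(actions, log.samples)
```

The logged pairs stand in for the human interventions on which the abstract says the policy can be refined; the refinement step itself is not sketched here.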
-
Publication No.: US20220410380A1
Publication Date: 2022-12-29
Application No.: US17843288
Application Date: 2022-06-17
Applicant: X Development LLC
Inventor: Yao Lu, Mengyuan Yan, Seyed Mohammad Khansari Zadeh, Alexander Herzog, Eric Jang, Karol Hausman, Yevgen Chebotar, Sergey Levine, Alexander Irpan
IPC: B25J9/16
Abstract: An initial set of offline, positive-only robotic demonstration data is utilized to pre-train an actor network and a critic network for robotic control, followed by further training of the networks based on online robotic episodes that utilize the network(s). Implementations enable the actor network to be effectively pre-trained, while mitigating occurrences of and/or the extent of forgetting when further trained based on episode data. Implementations additionally or alternatively enable the actor network to be trained to a given degree of effectiveness in fewer training steps. In various implementations, one or more adaptation techniques are utilized in performing the robotic episodes and/or in performing the robotic training. The adaptation techniques can each, individually, result in one or more corresponding advantages and, when used in any combination, the corresponding advantages can accumulate. The adaptation techniques include Positive Sample Filtering, Adaptive Exploration, Using Max Q Values, and Using the Actor in CEM.
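The abstract names "Using the Actor in CEM" without giving details. One plausible reading, sketched below purely as an assumption, is that the cross-entropy method (CEM) searches for an action maximizing the critic's Q value, and the actor's proposed action both seeds the sampling distribution and is always kept in the candidate pool. The function `cem_select_action`, its parameters, and the scalar action space are hypothetical and are not taken from the application.

```python
import random
import statistics
from typing import Any, Callable


def cem_select_action(
    q_fn: Callable[[Any, float], float],
    state: Any,
    actor_action: float,
    iterations: int = 3,
    population: int = 32,
    elite_frac: float = 0.25,
    init_std: float = 1.0,
    seed: int = 0,
) -> float:
    """Pick a scalar action by CEM over q_fn, seeded with the actor's proposal."""
    rng = random.Random(seed)
    mean, std = actor_action, init_std  # assumption: the actor seeds the search
    n_elite = max(2, int(population * elite_frac))
    best = actor_action
    for _ in range(iterations):
        # Sample candidates around the current mean and always include the
        # actor's own proposal in the pool.
        candidates = [rng.gauss(mean, std) for _ in range(population - 1)]
        candidates.append(actor_action)
        candidates.sort(key=lambda a: q_fn(state, a), reverse=True)
        elites = candidates[:n_elite]
        # Refit the sampling distribution to the highest-scoring candidates.
        mean = statistics.fmean(elites)
        std = statistics.pstdev(elites) + 1e-6
        if q_fn(state, elites[0]) > q_fn(state, best):
            best = elites[0]
    return best


if __name__ == "__main__":
    toy_q = lambda s, a: -(a - 2.0) ** 2  # toy critic with its peak at a = 2.0
    print(cem_select_action(toy_q, None, actor_action=0.0))
```

Because the actor's action is the starting incumbent and is replaced only by a strictly higher-scoring candidate, the returned action never scores below the actor's proposal under `q_fn`.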
-