-
Publication No.: US11571809B1
Publication Date: 2023-02-07
Application No.: US17017920
Filing Date: 2020-09-11
Applicant: X Development LLC
Inventor: Cristian Bodnar, Adrian Li, Karol Hausman, Peter Pastor Sampedro, Mrinal Kalakrishnan
Abstract: Techniques are described herein for robotic control using value distributions. In various implementations, as part of performing a robotic task, state data associated with the robot in an environment may be generated based at least in part on vision data captured by a vision component of the robot. A plurality of candidate actions may be sampled, e.g., from a continuous action space. A trained critic neural network model that represents a learned value function may be used to process a plurality of state-action pairs to generate a corresponding plurality of value distributions. Each state-action pair may include the state data and one of the plurality of sampled candidate actions. The state-action pair corresponding to the value distribution that satisfies one or more criteria may be selected from the plurality of state-action pairs. The robot may then be controlled to implement the sampled candidate action of the selected state-action pair.
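A minimal sketch of the selection loop the abstract describes, not taken from the patent: it assumes a C51-style critic that outputs a categorical distribution over fixed return atoms, and a highest-expected-value selection criterion (one possible instance of "one or more criteria"). All names, shapes, and hyperparameters here are hypothetical stand-ins.

```python
import numpy as np

# Assumption: a C51-style categorical value distribution over fixed
# support atoms; the patent does not specify this particular form.
NUM_ATOMS = 51
ATOMS = np.linspace(0.0, 1.0, NUM_ATOMS)  # support of possible returns

def critic(state, action):
    """Stand-in for the trained critic network: returns a probability
    vector over ATOMS for one state-action pair."""
    logits = np.random.randn(NUM_ATOMS)  # placeholder for a real forward pass
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def select_action(state, num_candidates=64, action_dim=4):
    # 1. Sample candidate actions from the continuous action space.
    candidates = np.random.uniform(-1.0, 1.0, size=(num_candidates, action_dim))

    # 2. Score each (state, action) pair via the critic's value distribution.
    #    Criterion used here (an assumption): highest expected return;
    #    risk-sensitive criteria over the distribution are equally plausible.
    expected_values = np.array([critic(state, a) @ ATOMS for a in candidates])

    # 3. Select the candidate whose distribution best satisfies the criterion.
    return candidates[expected_values.argmax()]

action = select_action(state=np.zeros(16))
```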
-
Publication No.: US20220410380A1
Publication Date: 2022-12-29
Application No.: US17843288
Filing Date: 2022-06-17
Applicant: X Development LLC
Inventor: Yao Lu, Mengyuan Yan, Seyed Mohammad Khansari Zadeh, Alexander Herzog, Eric Jang, Karol Hausman, Yevgen Chebotar, Sergey Levine, Alexander Irpan
IPC: B25J9/16
Abstract: Utilizing an initial set of offline positive-only robotic demonstration data for pre-training an actor network and a critic network for robotic control, followed by further training of the networks based on online robotic episodes that utilize the network(s). Implementations enable the actor network to be effectively pre-trained, while mitigating the occurrence and/or extent of forgetting when the networks are further trained on episode data. Implementations additionally or alternatively enable the actor network to be trained to a given degree of effectiveness in fewer training steps. In various implementations, one or more adaptation techniques are utilized in performing the robotic episodes and/or in performing the robotic training. Each adaptation technique can individually yield one or more corresponding advantages, and when the techniques are used in any combination, those advantages can accumulate. The adaptation techniques include Positive Sample Filtering, Adaptive Exploration, Using Max Q Values, and Using the Actor in CEM.
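A minimal sketch of one named adaptation technique, "Using the Actor in CEM", under the assumption that it means seeding the cross-entropy method's action-sampling distribution with the pre-trained actor's proposal so optimization starts near actions the policy already favors. The function names and hyperparameters are hypothetical; this is not the patent's implementation.

```python
import numpy as np

def q_value(state, action):
    """Stand-in for the critic's Q-value for one state-action pair."""
    return -np.sum((action - 0.3) ** 2)  # placeholder objective

def actor(state, action_dim=4):
    """Stand-in for the pre-trained actor network's action proposal."""
    return np.zeros(action_dim)

def cem_with_actor(state, action_dim=4, iters=3, pop=64, elites=6, sigma=0.5):
    # "Using the Actor in CEM" (sketch): center the initial CEM sampling
    # distribution on the actor's proposed action rather than on a fixed
    # or random mean, then refine it toward high-Q elite samples.
    mean = actor(state, action_dim)
    std = np.full(action_dim, sigma)
    for _ in range(iters):
        samples = np.random.normal(mean, std, size=(pop, action_dim))
        scores = np.array([q_value(state, a) for a in samples])
        elite = samples[np.argsort(scores)[-elites:]]  # top-scoring actions
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean

action = cem_with_actor(state=np.zeros(16))
```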
-