Robotic control using value distributions

    Publication No.: US11571809B1

    Publication date: 2023-02-07

    Application No.: US17017920

    Filing date: 2020-09-11

    Abstract: Techniques are described herein for robotic control using value distributions. In various implementations, as part of performing a robotic task, state data associated with the robot in an environment may be generated based at least in part on vision data captured by a vision component of the robot. A plurality of candidate actions may be sampled, e.g., from a continuous action space. A trained critic neural network model that represents a learned value function may be used to process a plurality of state-action pairs to generate a corresponding plurality of value distributions. Each state-action pair may include the state data and one of the plurality of sampled candidate actions. The state-action pair whose value distribution satisfies one or more criteria may be selected from the plurality of state-action pairs. The robot may then be controlled to implement the sampled candidate action of the selected state-action pair.
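
    The selection loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `sample_actions`, `toy_critic`, `expected_value`, and `select_action` are hypothetical, the critic is a hand-written stand-in for a trained neural network, the value distribution is assumed to be categorical over a fixed set of bins, and "highest expected value" is assumed as the selection criterion. The abstract itself only says that one or more criteria are applied.

```python
import math
import random

# Assumed support of the value distribution: five evenly spaced return bins.
VALUE_BINS = [0.0, 0.25, 0.5, 0.75, 1.0]


def sample_actions(n, dim, rng):
    """Sample n candidate actions uniformly from the continuous box [-1, 1]^dim."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]


def toy_critic(state, action):
    """Stand-in for a trained critic (assumption, not a learned model).

    Returns a categorical distribution over VALUE_BINS for one
    state-action pair. Actions closer to the state vector are given
    distributions concentrated on higher values.
    """
    dist = math.sqrt(sum((s - a) ** 2 for s, a in zip(state, action)))
    center = max(0.0, 1.0 - dist / 3.0)
    logits = [-((v - center) ** 2) / 0.02 for v in VALUE_BINS]
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]  # numerically stable softmax
    total = sum(exps)
    return [e / total for e in exps]


def expected_value(probs):
    """Assumed selection criterion: the mean of the value distribution."""
    return sum(p * v for p, v in zip(probs, VALUE_BINS))


def select_action(state, candidates, critic, criterion=expected_value):
    """Score every state-action pair and return the best action and its score."""
    scored = [(criterion(critic(state, a)), a) for a in candidates]
    best_score, best_action = max(scored, key=lambda pair: pair[0])
    return best_action, best_score


if __name__ == "__main__":
    rng = random.Random(0)
    state = [0.2, -0.4]  # placeholder for state data derived from vision
    candidates = sample_actions(16, len(state), rng)
    action, score = select_action(state, candidates, toy_critic)
    print("selected action:", action, "score:", score)
```

    Because the critic returns a full distribution rather than a single number, the `criterion` argument could be swapped for something other than the mean, such as a lower quantile for more conservative behavior. That choice is an assumption of this sketch and is not stated in the abstract.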
