-
Publication number: US12037013B1
Publication date: 2024-07-16
Application number: US17515244
Application date: 2021-10-29
Applicant: Zoox, Inc.
Inventor: Gary Linscott, Andreas Pasternak, Jefferson Bradfield Packer, Marin Kobilarov
CPC classification number: B60W60/0011, B60W60/00272, G06F11/3452, G06F11/3457, G06N20/00
Abstract: Automating reinforcement learning for autonomous vehicles may include assigning a probability to a scenario and varying that probability based at least in part on changes in performance by the autonomous vehicle associated with that scenario. The amount of time and computational bandwidth required to train a machine-learned component of an autonomous vehicle and the accuracy of the machine-learned component may be improved by determining a reward for performance of the autonomous vehicle in a scenario based at least in part on an impact severity metric. The impact severity metric may be determined based at least in part on a velocity, angle, and/or interaction area associated with the impact.
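Illustrative sketch (not taken from the patent): the two ideas in this abstract, a reward driven by an impact severity metric and a per-scenario probability that shifts as performance changes, might look roughly like the Python below. The severity formula, the function names, the `step` and `floor` parameters, and the choice to raise the probability of scenarios where performance dropped are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def impact_severity(rel_speed_mps, angle_rad, overlap_area_m2):
    """Hypothetical severity score built from the three inputs the abstract names.

    Assumption: severity grows with closing speed and contact area, and a
    head-on impact (angle 0) counts twice as much as a 90-degree glancing one.
    """
    directness = abs(math.cos(angle_rad))
    return rel_speed_mps * (0.5 + 0.5 * directness) * overlap_area_m2


def scenario_reward(collided, rel_speed_mps=0.0, angle_rad=0.0, overlap_area_m2=0.0):
    """Assumed reward: +1 for a clean run, minus the severity score on impact."""
    if not collided:
        return 1.0
    return -impact_severity(rel_speed_mps, angle_rad, overlap_area_m2)


def update_probabilities(probs, prev_perf, curr_perf, step=0.5, floor=0.05):
    """Re-weight scenario sampling probabilities from performance changes.

    Assumption: a scenario whose performance fell gets sampled more often,
    one that improved gets sampled less; `floor` keeps every scenario alive.
    """
    weights = {}
    for name, p in probs.items():
        delta = curr_perf[name] - prev_perf[name]
        weights[name] = max(floor, p * math.exp(-step * delta))
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}
```

With two equally likely scenarios, one improving and one regressing, `update_probabilities` shifts sampling mass toward the regressing scenario while keeping the probabilities normalized.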
-
Publication number: US11891088B1
Publication date: 2024-02-06
Application number: US17347088
Application date: 2021-06-14
Applicant: Zoox, Inc.
Inventor: Marin Kobilarov, Jefferson Bradfield Packer, Gowtham Garimella, Andreas Pasternak, Yiteng Zhang, Ruikun Yu
CPC classification number: B60W60/0015, G06N20/00, G07C5/008, G07C5/0808, B60W2050/0075, B60W2554/404, B60W2556/45
Abstract: A reward determined as part of a machine learning technique, such as reinforcement learning, may be used to control an adversarial agent in a simulation such that a component for controlling motion of the adversarial agent is trained to reduce the reward. Training the adversarial agent component may be subject to one or more constraints and/or may be balanced against one or more additional goals. Additionally or alternatively, the reward may be used to alter scenario data so that the scenario data reduces the reward, allowing the discovery of difficult scenarios and/or prospective events.
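Illustrative sketch (not taken from the patent): the idea of altering scenario data so that it reduces the reward, subject to constraints, can be approximated with a simple bounded random search. Everything here is a stand-in chosen for illustration: `ego_reward` is a toy formula rather than a simulated rollout, the parameter names and `BOUNDS` are hypothetical, and random search replaces whatever training procedure the patent actually claims.

```python
import random

# Hypothetical constraints on how far the scenario may be altered.
BOUNDS = {"gap_m": (2.0, 30.0), "agent_speed_mps": (0.0, 15.0)}


def ego_reward(scenario):
    """Toy stand-in for a simulation: the ego vehicle scores worse when the
    adversarial agent leaves a smaller gap and moves faster."""
    return scenario["gap_m"] / 10.0 - scenario["agent_speed_mps"] / 20.0


def clamp(value, low, high):
    return max(low, min(high, value))


def find_hard_scenario(start, iterations=200, scale=1.0, seed=0):
    """Perturb scenario parameters, keeping only changes that lower the reward."""
    rng = random.Random(seed)
    best = dict(start)
    best_reward = ego_reward(best)
    for _ in range(iterations):
        # Propose a nearby scenario, clamped to the allowed bounds.
        candidate = {
            key: clamp(value + rng.gauss(0.0, scale), *BOUNDS[key])
            for key, value in best.items()
        }
        candidate_reward = ego_reward(candidate)
        if candidate_reward < best_reward:
            best, best_reward = candidate, candidate_reward
    return best, best_reward
```

Calling `find_hard_scenario({"gap_m": 20.0, "agent_speed_mps": 5.0})` returns a scenario inside `BOUNDS` whose reward is no higher than the starting scenario's, which is the sense in which the search "discovers" more difficult scenarios.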
-