-
Publication number: US12061481B2
Publication date: 2024-08-13
Application number: US17291540
Filing date: 2019-11-27
Applicant: GOOGLE LLC
Inventor: Alexander Toshev, Marek Fiser, Ayzaan Wahid
CPC classification number: G05D1/0221
Abstract: Training and/or using both a high-level policy model and a low-level policy model for mobile robot navigation. High-level output generated using the high-level policy model at each iteration indicates a corresponding high-level action for robot movement in navigating to the navigation target. The low-level output generated at each iteration is based on the determined corresponding high-level action for that iteration, and is based on observation(s) for that iteration. The low-level policy model is trained to generate low-level output that defines low-level action(s) that define robot movement more granularly than the high-level action—and to generate low-level action(s) that avoid obstacles and/or that are efficient (e.g., distance and/or time efficiency).
-
Publication number: US20210397195A1
Publication date: 2021-12-23
Application number: US17291540
Filing date: 2019-11-27
Applicant: GOOGLE LLC
Inventor: Alexander Toshev, Marek Fiser, Ayzaan Wahid
IPC: G05D1/02
Abstract: Training and/or using both a high-level policy model and a low-level policy model for mobile robot navigation. High-level output generated using the high-level policy model at each iteration indicates a corresponding high-level action for robot movement in navigating to the navigation target. The low-level output generated at each iteration is based on the determined corresponding high-level action for that iteration, and is based on observation(s) for that iteration. The low-level policy model is trained to generate low-level output that defines low-level action(s) that define robot movement more granularly than the high-level action—and to generate low-level action(s) that avoid obstacles and/or that are efficient (e.g., distance and/or time efficiency).
-
Publication number: US20210325894A1
Publication date: 2021-10-21
Application number: US17275459
Filing date: 2019-09-13
Applicant: Google LLC
Inventor: Aleksandra Faust, Hao-tien Chiang, Anthony Francis, Marek Fiser
Abstract: Using reinforcement learning to train a policy network that can be utilized, for example, by a robot in performing robot navigation and/or other robotic tasks. Various implementations relate to techniques for automatically learning a reward function for training of a policy network through reinforcement learning, and automatically learning a neural network architecture for the policy network.
-
Publication number: US11941504B2
Publication date: 2024-03-26
Application number: US17040299
Filing date: 2019-03-22
Applicant: Google LLC
Inventor: Pararth Shah, Dilek Hakkani-Tur, Juliana Kew, Marek Fiser, Aleksandra Faust
IPC: G06N3/008, B25J9/16, B25J13/08, G05B13/02, G05D1/00, G05D1/02, G06F18/21, G06N3/044, G06T7/593, G06V20/10, G06V30/262, G10L15/16, G10L15/18, G10L15/22, G10L25/78
CPC classification number: G06N3/008, B25J9/161, B25J9/162, B25J9/163, B25J9/1697, B25J13/08, G05B13/027, G05D1/0221, G06F18/21, G06N3/044, G06T7/593, G06V20/10, G06V30/274, G10L15/16, G10L15/1815, G10L15/22, G10L25/78, G10L2015/223
Abstract: Implementations relate to using deep reinforcement learning to train a model that can be utilized, at each of a plurality of time steps, to determine a corresponding robotic action for completing a robotic task. Implementations additionally or alternatively relate to utilization of such a model in controlling a robot. The robotic action determined at a given time step utilizing such a model can be based on: current sensor data associated with the robot for the given time step, and free-form natural language input provided by a user. The free-form natural language input can direct the robot to accomplish a particular task, optionally with reference to one or more intermediary steps for accomplishing the particular task. For example, the free-form natural language input can direct the robot to navigate to a particular landmark, with reference to one or more intermediary landmarks to be encountered in navigating to the particular landmark.
-
Publication number: US20210086353A1
Publication date: 2021-03-25
Application number: US17040299
Filing date: 2019-03-22
Applicant: Google LLC
Inventor: Pararth Shah, Dilek Hakkani-Tur, Juliana Kew, Marek Fiser, Aleksandra Faust
IPC: B25J9/16, G10L25/78, G10L15/22, G10L15/18, G06K9/00, G06K9/62, G10L15/16, G06T7/593, G06K9/72, B25J13/08, G05D1/02, G05B13/02, G06N3/04
Abstract: Implementations relate to using deep reinforcement learning to train a model that can be utilized, at each of a plurality of time steps, to determine a corresponding robotic action for completing a robotic task. Implementations additionally or alternatively relate to utilization of such a model in controlling a robot. The robotic action determined at a given time step utilizing such a model can be based on: current sensor data associated with the robot for the given time step, and free-form natural language input provided by a user. The free-form natural language input can direct the robot to accomplish a particular task, optionally with reference to one or more intermediary steps for accomplishing the particular task. For example, the free-form natural language input can direct the robot to navigate to a particular landmark, with reference to one or more intermediary landmarks to be encountered in navigating to the particular landmark.
-
Publication number: US20240249109A1
Publication date: 2024-07-25
Application number: US18601159
Filing date: 2024-03-11
Applicant: GOOGLE LLC
Inventor: Pararth Shah, Dilek Hakkani-Tur, Juliana Kew, Marek Fiser, Aleksandra Faust
IPC: G06N3/008, B25J9/16, B25J13/08, G05B13/02, G06F18/21, G06N3/044, G06T7/593, G06V20/10, G06V30/262, G10L15/16, G10L15/18, G10L15/22, G10L25/78
CPC classification number: G06N3/008, B25J9/161, B25J9/162, B25J9/163, B25J9/1697, B25J13/08, G05B13/027, G06F18/21, G06N3/044, G06T7/593, G06V20/10, G06V30/274, G10L15/16, G10L15/1815, G10L15/22, G10L25/78, G10L2015/223
Abstract: Implementations relate to using deep reinforcement learning to train a model that can be utilized, at each of a plurality of time steps, to determine a corresponding robotic action for completing a robotic task. Implementations additionally or alternatively relate to utilization of such a model in controlling a robot. The robotic action determined at a given time step utilizing such a model can be based on: current sensor data associated with the robot for the given time step, and free-form natural language input provided by a user. The free-form natural language input can direct the robot to accomplish a particular task, optionally with reference to one or more intermediary steps for accomplishing the particular task. For example, the free-form natural language input can direct the robot to navigate to a particular landmark, with reference to one or more intermediary landmarks to be encountered in navigating to the particular landmark.
-
Publication number: US11972339B2
Publication date: 2024-04-30
Application number: US17040299
Filing date: 2019-03-22
Applicant: Google LLC
Inventor: Pararth Shah, Dilek Hakkani-Tur, Juliana Kew, Marek Fiser, Aleksandra Faust
IPC: G06N3/008, B25J9/16, B25J13/08, G05B13/02, G05D1/00, G05D1/02, G06F18/21, G06N3/044, G06T7/593, G06V20/10, G06V30/262, G10L15/16, G10L15/18, G10L15/22, G10L25/78
CPC classification number: G06N3/008, B25J9/161, B25J9/162, B25J9/163, B25J9/1697, B25J13/08, G05B13/027, G05D1/0221, G06F18/21, G06N3/044, G06T7/593, G06V20/10, G06V30/274, G10L15/16, G10L15/1815, G10L15/22, G10L25/78, G10L2015/223
Abstract: Implementations relate to using deep reinforcement learning to train a model that can be utilized, at each of a plurality of time steps, to determine a corresponding robotic action for completing a robotic task. Implementations additionally or alternatively relate to utilization of such a model in controlling a robot. The robotic action determined at a given time step utilizing such a model can be based on: current sensor data associated with the robot for the given time step, and free-form natural language input provided by a user. The free-form natural language input can direct the robot to accomplish a particular task, optionally with reference to one or more intermediary steps for accomplishing the particular task. For example, the free-form natural language input can direct the robot to navigate to a particular landmark, with reference to one or more intermediary landmarks to be encountered in navigating to the particular landmark.