Invention Application
- Patent Title: TRAINING A POLICY MODEL FOR A ROBOTIC TASK, USING REINFORCEMENT LEARNING AND UTILIZING DATA THAT IS BASED ON EPISODES, OF THE ROBOTIC TASK, GUIDED BY AN ENGINEERED POLICY
- Application No.: US17161845
- Application Date: 2021-01-29
- Publication No.: US20220245503A1
- Publication Date: 2022-08-04
- Inventor: Adrian Li, Benjamin Holson, Alexander Herzog, Mrinal Kalakrishnan
- Applicant: X Development LLC
- Applicant Address: US CA Mountain View
- Assignee: X Development LLC
- Current Assignee: X Development LLC
- Current Assignee Address: US CA Mountain View
- Main IPC: G06N20/00
- IPC: G06N20/00; G06N3/00; G06N5/04

Abstract:
Implementations disclosed herein relate to utilizing at least one existing manually engineered policy, for a robotic task, in training an RL policy model that can be used to at least selectively replace a portion of the engineered policy. The RL policy model can be trained for replacing a portion of a robotic task and can be trained based on data from episodes of attempting performance of the robotic task, including episodes in which the portion is performed based on the engineered policy and/or other portion(s) are performed based on the engineered policy. Once trained, the RL policy model can be used, at least selectively and in lieu of utilization of the engineered policy, to perform the portion of the robotic task, while other portion(s) of the robotic task are performed utilizing the engineered policy and/or other similarly trained (but distinct) RL policy model(s).
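
The selective replacement described in the abstract can be illustrated with a minimal Python sketch. This is not the implementation from the application. The class name `HybridController`, the per-portion `gates` mechanism, the portion names, the scalar state, and the toy dynamics are all assumptions made for illustration only. The sketch shows (a) an episode guided entirely by engineered policies, from which data for one portion is collected, and (b) a later episode in which an RL policy stands in for that portion only when a gate allows it, with the engineered policy as the fallback.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

State = float   # toy scalar state (assumption; a real robot state is far richer)
Action = float
Policy = Callable[[State], Action]
Step = Tuple[str, str, State, Action]  # (portion, policy source, state, action)


@dataclass
class HybridController:
    """Hypothetical controller that runs a task as an ordered list of portions."""
    portions: List[str]
    engineered: Dict[str, Policy]
    rl: Dict[str, Policy] = field(default_factory=dict)
    # Hypothetical gates deciding when the RL policy may replace the engineered one.
    gates: Dict[str, Callable[[State], bool]] = field(default_factory=dict)

    def select(self, portion: str, state: State) -> Tuple[str, Policy]:
        """Pick the RL policy for a portion if one exists and its gate approves."""
        gate = self.gates.get(portion, lambda s: True)
        if portion in self.rl and gate(state):
            return "rl", self.rl[portion]
        return "engineered", self.engineered[portion]

    def run_episode(self, state: State) -> List[Step]:
        """Run every portion in order and log which policy produced each action."""
        log: List[Step] = []
        for portion in self.portions:
            source, policy = self.select(portion, state)
            action = policy(state)
            log.append((portion, source, state, action))
            state = state + action  # toy dynamics, for illustration only
        return log


controller = HybridController(
    portions=["approach", "grasp", "place"],
    engineered={
        "approach": lambda s: 1.0,
        "grasp": lambda s: 0.5,
        "place": lambda s: 0.25,
    },
)

# (a) Episode guided entirely by the engineered policy; keep the "grasp" data,
# which could serve as training data for an RL policy for that portion.
episode = controller.run_episode(0.0)
grasp_data = [(s, a) for portion, _, s, a in episode if portion == "grasp"]

# (b) Register a stand-in for a trained RL policy for "grasp" only, used
# selectively: the gate approves it only when the state is at least 1.0.
controller.rl["grasp"] = lambda s: 0.75
controller.gates["grasp"] = lambda s: s >= 1.0

hybrid_episode = controller.run_episode(0.0)     # gate approves the RL policy
fallback_episode = controller.run_episode(-1.0)  # gate rejects it; engineered policy runs
```

In this sketch the other portions ("approach" and "place") continue to use their engineered policies throughout, mirroring the abstract's description of replacing only a portion of the task. How the RL policy is actually trained from the collected data is outside the scope of the example.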