TRAINING A POLICY MODEL FOR A ROBOTIC TASK, USING REINFORCEMENT LEARNING AND UTILIZING DATA THAT IS BASED ON EPISODES, OF THE ROBOTIC TASK, GUIDED BY AN ENGINEERED POLICY

Invention Application

US20220245503A1 TRAINING A POLICY MODEL FOR A ROBOTIC TASK, USING REINFORCEMENT LEARNING AND UTILIZING DATA THAT IS BASED ON EPISODES, OF THE ROBOTIC TASK, GUIDED BY AN ENGINEERED POLICY 有权

Please log in to see more content

Patent Title: TRAINING A POLICY MODEL FOR A ROBOTIC TASK, USING REINFORCEMENT LEARNING AND UTILIZING DATA THAT IS BASED ON EPISODES, OF THE ROBOTIC TASK, GUIDED BY AN ENGINEERED POLICY
Application No.: US17161845

Application Date: 2021-01-29
Publication No.: US20220245503A1

Publication Date: 2022-08-04
Inventor: Adrian Li , Benjamin Holson , Alexander Herzog , Mrinal Kalakrishnan
Applicant: X Development LLC
Applicant Address: US CA Mountain View
Assignee: X Development LLC
Current Assignee: X Development LLC
Current Assignee Address: US CA Mountain View
Main IPC: G06N20/00
IPC: G06N20/00 ; G06N3/00 ; G06N5/04

TRAINING A POLICY MODEL FOR A ROBOTIC TASK, USING REINFORCEMENT LEARNING AND UTILIZING DATA THAT IS BASED ON EPISODES, OF THE ROBOTIC TASK, GUIDED BY AN ENGINEERED POLICY

Abstract:

Implementations disclosed herein relate to utilizing at least one existing manually engineered policy, for a robotic task, in training an RL policy model that can be used to at least selectively replace a portion of the engineered policy. The RL policy model can be trained for replacing a portion of a robotic task and can be trained based on data from episodes of attempting performance of the robotic task, including episodes in which the portion is performed based on the engineered policy and/or other portion(s) are performed based on the engineered policy. Once trained, the RL policy model can be used, at least selectively and in lieu of utilization of the engineered policy, to perform the portion of robotic task, while other portion(s) of the robotic task are performed utilizing the engineered policy and/or other similarly trained (but distinct) RL policy model(s).

Public/Granted literature

US12210943B2 Training a policy model for a robotic task, using reinforcement learning and utilizing data that is based on episodes, of the robotic task, guided by an engineered policy Public/Granted day:2025-01-28

Information query

Global Dossier Espacenet

IPC分类:

G	物理
G06	计算；推算或计数
G06N	基于特定计算模型的计算机系统
G06N20/00	机器学习