Invention Grant
- Patent Title: Inverse reinforcement learning by density ratio estimation
- Application No.: US15329690
- Application Date: 2015-08-07
- Publication No.: US10896382B2
- Publication Date: 2021-01-19
- Inventor: Eiji Uchibe, Kenji Doya
- Applicant: Okinawa Institute of Science and Technology School Corporation
- Applicant Address: JP Okinawa
- Assignee: Okinawa Institute of Science and Technology School Corporation
- Current Assignee: Okinawa Institute of Science and Technology School Corporation
- Current Assignee Address: JP Okinawa
- Agency: Westerman, Hattori, Daniels & Adrian, LLP
- International Application: PCT/JP2015/004001, 2015-08-07
- International Publication: WO2016/021210, 2016-02-11
- Main IPC: G06N20/00
- IPC: G06N20/00; G06N7/00

Abstract:
A method of inverse reinforcement learning for estimating cost and value functions of behaviors of a subject includes: acquiring data representing changes in state variables that define the behaviors of the subject; applying a modified Bellman equation given by Eq. (1) to the acquired data: q(x) + gV(y) − V(x) = −ln{pi(y|x)/p(y|x)} (1), where q(x) and V(x) denote a cost function and a value function, respectively, at state x, g represents a discount factor, and p(y|x) and pi(y|x) denote state transition probabilities before and after learning, respectively; estimating a density ratio pi(y|x)/p(y|x) in Eq. (1); estimating q(x) and V(x) in Eq. (1) using the least squares method in accordance with the estimated density ratio pi(y|x)/p(y|x); and outputting the estimated q(x) and V(x).
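The two estimation steps in the abstract can be illustrated on a small discrete-state toy problem. The following Python sketch is not the patented implementation: the function names (`estimate_log_ratio`, `fit_cost_and_value`), the tabular representation of q and V, and the smoothed-count ratio estimator are all assumptions made for illustration, standing in for whatever density ratio estimator and function approximators are actually used. It builds a synthetic pair of transition matrices that satisfy Eq. (1) exactly, then solves Eq. (1) for q and V by ordinary least squares.

```python
import numpy as np

def estimate_log_ratio(controlled, baseline, n_states, smoothing=1.0):
    """Hypothetical count-based stand-in for a density ratio estimator.

    controlled / baseline: integer arrays of shape (N, 2) holding (x, y)
    transitions sampled under pi(y|x) and p(y|x), respectively.
    Returns ln(pi(y|x) / p(y|x)) as an (n_states, n_states) array.
    """
    def conditional(transitions):
        counts = np.full((n_states, n_states), smoothing)  # additive smoothing
        np.add.at(counts, (transitions[:, 0], transitions[:, 1]), 1.0)
        return counts / counts.sum(axis=1, keepdims=True)
    return np.log(conditional(controlled)) - np.log(conditional(baseline))

def fit_cost_and_value(log_ratio, g):
    """Least-squares fit of tabular q and V to Eq. (1):
    q(x) + g*V(y) - V(x) = -ln(pi(y|x) / p(y|x)) for every pair (x, y)."""
    n = log_ratio.shape[0]
    xs, ys = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    xs, ys = xs.ravel(), ys.ravel()
    rows = np.arange(n * n)
    A = np.zeros((n * n, 2 * n))   # unknowns: [q(0..n-1), V(0..n-1)]
    A[rows, xs] = 1.0              # coefficient of q(x)
    A[rows, n + ys] += g           # + g * V(y)
    A[rows, n + xs] -= 1.0         # - V(x)
    theta, *_ = np.linalg.lstsq(A, -log_ratio.ravel(), rcond=None)
    return theta[:n], theta[n:]

# Synthetic problem (assumed for illustration) constructed to satisfy Eq. (1).
rng = np.random.default_rng(0)
n, g = 4, 0.9
p = rng.random((n, n))
p /= p.sum(axis=1, keepdims=True)            # transitions before learning
V_true = rng.normal(size=n)
w = p * np.exp(-g * V_true)                  # reweight each next state y
pi = w / w.sum(axis=1, keepdims=True)        # transitions after learning
q_true = V_true + np.log(w.sum(axis=1))      # cost consistent with Eq. (1)

# Fit from the exact ratio.
q_est, V_est = fit_cost_and_value(np.log(pi / p), g)

# Fit from a ratio estimated from sampled transitions.
def sample(T, m):
    xs = rng.integers(n, size=m)
    ys = np.array([rng.choice(n, p=T[x]) for x in xs])
    return np.column_stack([xs, ys])

log_ratio_hat = estimate_log_ratio(sample(pi, 5000), sample(p, 5000), n)
q_hat, V_hat = fit_cost_and_value(log_ratio_hat, g)
```

In this tabular setting Eq. (1) determines V only up to an additive constant c, with q shifting by (1 − g)c, so `np.linalg.lstsq` returns the minimum-norm solution rather than `V_true` itself. The sample-based fit (`q_hat`, `V_hat`) is only as accurate as the estimated ratio.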
Public/Granted literature:
- US20170213151A1 INVERSE REINFORCEMENT LEARNING BY DENSITY RATIO ESTIMATION, Public/Granted day: 2017-07-27