Invention Grant
US08545332B2 Optimal policy determination using repeated Stackelberg games with unknown player preferences
Status: Lapsed
- Patent Title: Optimal policy determination using repeated Stackelberg games with unknown player preferences
- Application No.: US13364843
- Application Date: 2012-02-02
- Publication No.: US08545332B2
- Publication Date: 2013-10-01
- Inventor: Janusz Marecki, Richard B. Segal, Gerald J. Tesauro
- Applicant: Janusz Marecki, Richard B. Segal, Gerald J. Tesauro
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Scully, Scott, Murphy & Presser, P.C.
- Agent: Daniel P. Morris, Esq.
- Main IPC: A63F13/00
- IPC: A63F13/00

Abstract:
A system, method and computer program product for planning actions in a repeated Stackelberg Game, played for a fixed number of rounds, where the payoffs or preferences of the follower are initially unknown to the leader, and a prior probability distribution over follower types is available. In repeated Bayesian Stackelberg games, the objective is to maximize the leader's cumulative expected payoff over the rounds of the game. The optimal plans in such games make intelligent tradeoffs between actions that reveal information regarding the unknown follower preferences, and actions that aim for high immediate payoff. The method solves for such optimal plans according to a Monte Carlo Tree Search method wherein simulation trials draw instances of followers from said prior probability distribution. Some embodiments additionally implement a method for pruning dominated leader strategies.
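The search described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the two-action toy payoffs, the three candidate leader commitments in `STRATEGIES`, the UCB exploration constant `c`, and the function names. The sketch is a simplified UCT-style search with no separate rollout policy and no pruning of dominated strategies. Each trial draws one follower type from the prior, the follower best-responds to the leader's commitment (ties go to the lowest-numbered action), and the leader's returns are backed up along the observed history.

```python
import math
import random
from collections import defaultdict

# Hypothetical toy game (illustrative numbers, not from the patent).
LEADER_PAYOFF = [[1.0, 0.0],    # LEADER_PAYOFF[i][j]: leader action i, follower action j
                 [0.0, 0.55]]
FOLLOWER_PAYOFF = [             # FOLLOWER_PAYOFF[t][i][j] for follower type t
    [[1.0, 0.0], [0.0, 1.0]],   # type 0 prefers to match the leader's action
    [[0.0, 1.0], [0.0, 1.0]],   # type 1 always prefers follower action 1
]
PRIOR = [0.5, 0.5]              # prior probability of each follower type
# Candidate leader commitments: mixed strategies over the two leader actions.
STRATEGIES = [(1.0, 0.0), (0.0, 1.0), (0.25, 0.75)]


def best_response(ftype, x):
    """Follower action that maximizes the follower's expected payoff against x."""
    payoff = FOLLOWER_PAYOFF[ftype]
    return max(range(2), key=lambda j: sum(x[i] * payoff[i][j] for i in range(2)))


def leader_value(x, j):
    """Leader's expected one-round payoff for commitment x and follower action j."""
    return sum(x[i] * LEADER_PAYOFF[i][j] for i in range(2))


def plan_first_move(rounds, trials, c=2.0, seed=0):
    """Return the index of the most-visited first-round strategy after the search."""
    rng = random.Random(seed)
    visits = defaultdict(int)    # (history, strategy index) -> visit count
    total = defaultdict(float)   # (history, strategy index) -> sum of returns

    def select(history):
        counts = [visits[(history, a)] for a in range(len(STRATEGIES))]
        if 0 in counts:          # try every strategy once before using UCB
            return counts.index(0)
        log_n = math.log(sum(counts))
        return max(range(len(STRATEGIES)),
                   key=lambda a: total[(history, a)] / counts[a]
                   + c * math.sqrt(log_n / counts[a]))

    for _ in range(trials):
        # Each simulation trial draws one follower instance from the prior.
        ftype = rng.choices(range(len(PRIOR)), weights=PRIOR)[0]
        history, path, rewards = (), [], []
        for _ in range(rounds):
            a = select(history)
            j = best_response(ftype, STRATEGIES[a])
            path.append((history, a))
            rewards.append(leader_value(STRATEGIES[a], j))
            history += ((a, j),)  # the leader observes the follower's response
        ret = 0.0
        for key, r in zip(reversed(path), reversed(rewards)):
            ret += r              # return from this round to the end of the game
            visits[key] += 1
            total[key] += ret
    return max(range(len(STRATEGIES)), key=lambda a: visits[((), a)])
```

In this toy game, strategy 1 has the higher expected payoff in a single round (0.55 versus 0.5 for strategy 0), but only strategy 0 makes the two follower types respond differently. Over two rounds, opening with strategy 0 has an exact expected return of 0.5 + 0.775 = 1.275, against 1.10 for opening with strategy 1, so a call such as `plan_first_move(rounds=2, trials=20000)` would be expected to favor the information-revealing opening given enough trials.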
Public/Granted literature
- US20130204412A1 OPTIMAL POLICY DETERMINATION USING REPEATED STACKELBERG GAMES WITH UNKNOWN PLAYER PREFERENCES Public/Granted day: 2013-08-08