Abstract:
Efficient heuristic methods are described for approximating the optimal leader strategy for security domains where threats come from unknown adversaries. These problems can be modeled as Bayes-Stackelberg games. An embodiment of the heuristic method can include defining a patrolling or security domain problem as a mixed-integer quadratic program. The mixed-integer quadratic program can be converted to a mixed-integer linear program. For a single follower (e.g., robber or terrorist) scenario, the mixed-integer linear program can be solved, subject to appropriate constraints. For embodiments applicable to multiple follower situations, the relevant mixed-integer quadratic program and related mixed-integer linear program can be decomposed, e.g., by changing the response function for the follower from a pure strategy to a weighted combination over various pure follower strategies where the weights are probabilities of occurrence of each of the follower types.
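The step from a quadratic to a linear program can be illustrated for the single-follower case. The block below is a sketch under stated assumptions, not the formulation claimed in the abstract: all symbols are introduced here for illustration (leader mixed strategy x, follower pure strategy q, leader and follower payoff matrices R and C, follower value a, a large constant M, and the product variable z). It shows one standard way to remove the bilinear term x_i q_j when q_j is binary.

```latex
% Assumed single-follower problem as a mixed-integer quadratic program (MIQP).
\begin{align*}
\max_{x,q,a}\quad & \sum_{i}\sum_{j} R_{ij}\, x_i\, q_j \\
\text{s.t.}\quad & \sum_i x_i = 1, \qquad \sum_j q_j = 1, \\
& 0 \le a - \sum_i C_{ij}\, x_i \le (1 - q_j)\, M \quad \forall j, \\
& x_i \in [0,1], \quad q_j \in \{0,1\}, \quad a \in \mathbb{R}.
\end{align*}
% Substituting z_{ij} = x_i q_j gives a mixed-integer linear program (MILP).
\begin{align*}
\max_{z,q,a}\quad & \sum_{i}\sum_{j} R_{ij}\, z_{ij} \\
\text{s.t.}\quad & \sum_i \sum_j z_{ij} = 1, \qquad \sum_j z_{ij} \le 1 \quad \forall i, \\
& q_j \le \sum_i z_{ij} \le 1 \quad \forall j, \qquad \sum_j q_j = 1, \\
& 0 \le a - \sum_i C_{ij} \Big( \sum_h z_{ih} \Big) \le (1 - q_j)\, M \quad \forall j, \\
& z_{ij} \in [0,1], \quad q_j \in \{0,1\}, \quad a \in \mathbb{R}.
\end{align*}
```

In this sketch the leader strategy is recovered as x_i = \sum_j z_{ij}. With several follower types, the same construction would be repeated per type and the objective weighted by the probability of each type, in line with the weighted combination the abstract mentions.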
Abstract:
Techniques are described for Stackelberg games in which one agent (the leader) must commit to a strategy that other agents (the followers or adversaries) can observe before choosing their own strategies, and in which the leader is uncertain about the types of adversaries it may face. Such games are important in security domains, where, for example, a security agent (leader) must commit to a strategy of patrolling certain areas, and robbers (followers) have a chance to observe this strategy over time before choosing where to attack. An efficient exact algorithm is described for finding the optimal strategy for the leader to commit to in these games. This algorithm, Decomposed Optimal Bayesian Stackelberg Solver or “DOBSS,” is based on a novel and compact mixed-integer linear programming formulation. The algorithm can be implemented in a method, software, and/or system including computer or processor functionality.
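To make the commitment setting concrete, the following minimal sketch evaluates a given leader mixed strategy against followers of uncertain type. It is not DOBSS and does not solve for the optimal strategy; it only computes the quantity such a solver would maximize. The function names, the payoff matrices, and the rule that followers break ties in the leader's favor are assumptions made for illustration.

```python
def follower_best_response(x, R, C):
    """Return the follower's best pure action against leader mix x.

    R[i][j] and C[i][j] are the leader and follower payoffs when the leader
    plays i and the follower plays j. Ties in follower payoff are broken in
    the leader's favor (an assumption of this sketch).
    """
    n_leader, n_follower = len(R), len(R[0])

    def follower_value(j):
        return sum(x[i] * C[i][j] for i in range(n_leader))

    def leader_value(j):
        return sum(x[i] * R[i][j] for i in range(n_leader))

    return max(range(n_follower), key=lambda j: (follower_value(j), leader_value(j)))


def leader_expected_payoff(x, follower_types):
    """Expected leader payoff of committing to x.

    follower_types is a list of (probability, R, C) tuples, one per type.
    """
    total = 0.0
    for prob, R, C in follower_types:
        j = follower_best_response(x, R, C)
        total += prob * sum(x[i] * R[i][j] for i in range(len(x)))
    return total


# Hypothetical 2x2 game with a single follower type.
R = [[2, 4], [1, 3]]
C = [[1, 0], [0, 2]]
single_type = [(1.0, R, C)]

pure_commitment = leader_expected_payoff([1.0, 0.0], single_type)   # 2.0
mixed_commitment = leader_expected_payoff([0.5, 0.5], single_type)  # 3.5
```

In this toy game, committing to an even mix yields a higher leader payoff than committing to the first pure action, because the observed mix changes the follower's best response.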
Abstract:
System, method and computer program product for modelling Risk-Sensitive Partially-Observable Markov Decision Processes (POMDPs), e.g., in a high-risk domain such as financial planning, and for solving such models exactly, such that agents maximize the expected utility of their actions. The system and method employ an exact algorithm for solving Risk-Sensitive POMDPs with piecewise linear utility functions by representing the underlying value functions with sets of piecewise bilinear functions (computed using functional value iteration) and pruning the dominated bilinear functions using efficient linear programming approximations of the underlying non-convex bilinear programs. Considering piecewise linear approximations of utility functions, the method (i) defines the Risk-Sensitive POMDP model, which incorporates value functions V(b,w) where argument “b” is a belief state and argument “w” is a continuous wealth dimension; (ii) derives the fundamental properties of the underlying value functions and provides a functional value iteration technique to compute them; and (iii) determines the dominated value functions, to speed up the algorithm.
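The role of the wealth argument w can be illustrated with a one-step example. The sketch below is not the functional value iteration or pruning procedure of the abstract; it only shows how a piecewise linear utility over wealth, combined with a belief over states, makes an agent risk-sensitive. The breakpoints, rewards, and function names are hypothetical.

```python
from bisect import bisect_right


def make_piecewise_linear(points):
    """Build a utility function from (wealth, utility) breakpoints.

    Values outside the breakpoint range are extrapolated along the nearest
    segment (an assumption of this sketch).
    """
    points = sorted(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def utility(w):
        # Index of the segment containing w, clamped to a valid segment.
        i = bisect_right(xs, w) - 1
        i = max(0, min(i, len(xs) - 2))
        slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        return ys[i] + slope * (w - xs[i])

    return utility


def expected_utility(belief, wealth, rewards, utility):
    """One-step expected utility: sum over states s of b(s) * U(w + r(s))."""
    return sum(b * utility(wealth + r) for b, r in zip(belief, rewards))


# Hypothetical concave utility: losses weigh twice as much as gains.
utility = make_piecewise_linear([(-10, -20), (0, 0), (10, 10)])
belief = [0.5, 0.5]

risky = expected_utility(belief, 0.0, [10.0, -10.0], utility)  # -5.0
safe = expected_utility(belief, 0.0, [0.0, 0.0], utility)      # 0.0
```

Both actions have the same expected reward (zero), yet the expected utility of the risky action is lower, which is the kind of preference a risk-neutral POMDP cannot express.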
Abstract:
A system, method and computer program product for planning actions in a repeated Stackelberg Game, played for a fixed number of rounds, where the payoffs or preferences of the follower are initially unknown to the leader, and a prior probability distribution over follower types is available. In repeated Bayesian Stackelberg games, the objective is to maximize the leader's cumulative expected payoff over the rounds of the game. The optimal plans in such games make intelligent tradeoffs between actions that reveal information regarding the unknown follower preferences, and actions that aim for high immediate payoff. The method solves for such optimal plans according to a Monte Carlo Tree Search method wherein simulation trials draw instances of followers from said prior probability distribution. Some embodiments additionally implement a method for pruning dominated leader strategies.
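The sampling step described above, in which simulation trials draw follower instances from the prior, can be sketched as follows. This is a flat Monte Carlo evaluation of one fixed leader plan, not a tree search, and it does not model learning about the follower between rounds. The payoff matrices, the shared leader payoff matrix R, and all function names are assumptions made for illustration.

```python
import random


def best_response(x, C):
    """Follower pure action maximizing its expected payoff against leader mix x."""
    n_actions = len(C[0])
    values = [sum(x[i] * C[i][j] for i in range(len(x))) for j in range(n_actions)]
    return max(range(n_actions), key=lambda j: values[j])


def simulate_plan(plan, R, follower_types, prior, n_trials, rng):
    """Estimate the cumulative expected leader payoff of a fixed plan.

    plan is a list of leader mixed strategies, one per round. Each trial draws
    one follower payoff matrix from the prior and plays every round against it.
    """
    total = 0.0
    for _ in range(n_trials):
        C = rng.choices(follower_types, weights=prior, k=1)[0]
        for x in plan:
            j = best_response(x, C)
            total += sum(x[i] * R[i][j] for i in range(len(x)))
    return total / n_trials


# Hypothetical two-round game with two follower types.
R = [[2, 4], [1, 3]]
type_a = [[1, 0], [0, 2]]
type_b = [[0, 1], [0, 1]]
plan = [[1.0, 0.0], [0.5, 0.5]]

rng = random.Random(0)
estimate = simulate_plan(plan, R, [type_a, type_b], [0.5, 0.5], 1000, rng)
```

A tree search would wrap this kind of sampled rollout in a selection and backup loop over leader actions and observed follower responses; that part is omitted here.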
Abstract:
System, method and computer program product for modelling information sharing domains as Partially Observable Markov Decision Processes (POMDPs), providing solutions that treat information sharing as a sequential process in which the trustworthiness of the information recipients is monitored using data leakage detection mechanisms. In one embodiment, the system, method and computer program product (i) formulate information sharing decisions using POMDPs combined with a digital watermarking leakage detection mechanism, and (ii) derive optimal information sharing strategies for the sender and optimal information leakage strategies for a recipient as a function of the efficacy of the underlying monitoring mechanism. By employing POMDPs in information sharing domains, users (senders) can maximize the expected reward of their data/information sharing actions.
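Monitoring trustworthiness through leakage detection amounts to a belief update after each sharing round. The sketch below shows a plain Bayes update over a two-state hidden variable (trusted or untrusted recipient). It is an illustration only, not the model of the abstract: the detection probabilities and the function name are hypothetical, and deriving the optimal sharing strategy is not covered.

```python
def update_trust_belief(p_trust, leak_detected, p_detect_if_trusted, p_detect_if_untrusted):
    """Posterior probability that the recipient is trustworthy.

    p_detect_if_trusted and p_detect_if_untrusted are the assumed probabilities
    that the monitoring mechanism reports a leak for each recipient type.
    """
    if leak_detected:
        like_trusted = p_detect_if_trusted
        like_untrusted = p_detect_if_untrusted
    else:
        like_trusted = 1.0 - p_detect_if_trusted
        like_untrusted = 1.0 - p_detect_if_untrusted

    numerator = p_trust * like_trusted
    evidence = numerator + (1.0 - p_trust) * like_untrusted
    if evidence == 0.0:
        raise ValueError("observation has zero probability under the current belief")
    return numerator / evidence


# Hypothetical monitor: no false alarms, catches an untrusted recipient half the time.
belief = 0.5
belief = update_trust_belief(belief, False, 0.0, 0.5)  # one clean round
belief = update_trust_belief(belief, False, 0.0, 0.5)  # a second clean round
```

A more effective monitor (a larger gap between the two detection probabilities) moves the belief faster, which is how the efficacy of the mechanism would enter the sender's decisions.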