Abstract
Although Partially Observable Stochastic Games (POSGs) provide a powerful mathematical paradigm for modeling multi-agent dynamic decision making under uncertainty and partial information, they are notoriously hard to solve (e.g., common-payoff POSGs are NEXP-complete) and require extensive data on each agent. The latter can be a serious challenge for a defending agent with limited knowledge of its adversary. A worst-case analysis can significantly reduce both the computational complexity of the model and its data requirements regarding the adversary; further, a (near) optimal worst-case policy can serve as a useful guide to action selection for risk-averse defenders (e.g., as a benchmark). This article introduces a worst-case analysis of a leader–follower POSG in which (i) the defending leader has little knowledge of the adversarial follower’s reward structure, level of rationality, and process for gathering and transmitting data relevant for decision making, and (ii) the objective is to determine a best worst-case value function and a control strategy for the leader. We show that the worst-case assumption transforms this POSG into a more computationally tractable single-agent problem with a simple sufficient statistic. However, the value function can be non-convex, in contrast with the value function of a partially observable Markov decision process. We design an iterative solution procedure for computing a lower bound on the leader’s value function and a corresponding control policy for the finite-horizon case. The approach is illustrated numerically on a security example to support decision making.
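The non-convexity noted above can be seen in a toy calculation. The following is a minimal Python sketch, not the article's model or solution procedure: it assumes the leader's value is a function of a two-state probability vector `b`, and the function names, the payoff vectors in `alpha_sets`, and all numbers are hypothetical. A maximum over linear functions of `b` (the familiar partially observable Markov decision process form) is always convex, whereas a maximum over leader options of a minimum over adversary responses need not be.

```python
import numpy as np

def belief(p):
    """Two-state probability vector (hypothetical sufficient statistic)."""
    return np.array([p, 1.0 - p])

def pomdp_style_value(b, alphas):
    """Max over linear functions of b: convex in b."""
    return max(float(np.dot(a, b)) for a in alphas)

def worst_case_value(b, alpha_sets):
    """Max over leader options of the min over adversary responses.

    Each inner list holds the linear payoffs the adversary can force
    for one leader option; the result can be non-convex in b.
    """
    return max(min(float(np.dot(a, b)) for a in option) for option in alpha_sets)

# Hypothetical numbers chosen only to make the effect visible.
alpha_sets = [
    [np.array([1.0, 0.0]), np.array([0.0, 1.0])],  # option A: adversary picks the worse payoff
    [np.array([0.3, 0.3])],                        # option B: constant payoff 0.3
]
all_alphas = [a for option in alpha_sets for a in option]

# Compare the midpoint value with the average of the endpoint values.
ends_wc = 0.5 * (worst_case_value(belief(0.0), alpha_sets)
                 + worst_case_value(belief(1.0), alpha_sets))
mid_wc = worst_case_value(belief(0.5), alpha_sets)
ends_pomdp = 0.5 * (pomdp_style_value(belief(0.0), all_alphas)
                    + pomdp_style_value(belief(1.0), all_alphas))
mid_pomdp = pomdp_style_value(belief(0.5), all_alphas)

print("worst case: midpoint above chord ->", mid_wc > ends_wc)
print("max of linear: midpoint at or below chord ->", mid_pomdp <= ends_pomdp)
```

In this sketch the worst-case value at the midpoint belief lies above the chord joining the endpoint values, which a convex function cannot do, while the max-of-linear value stays at or below its chord.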
Additional information
Notes on contributors
Yanling Chang
Yanling Chang is an assistant professor in the Department of Engineering Technology and Industrial Distribution and the Department of Industrial and Systems Engineering, Texas A&M University, College Station, TX, USA. She received her Bachelor’s degree in Electronic and Information Science and Technology from Peking University, Beijing, China, and her Master’s degree in Mathematics and PhD degree in Operations Research, both from Georgia Institute of Technology, Atlanta, GA, USA.
Chelsea C. White
Chelsea C. White received his PhD from the University of Michigan (UM) in Computer, Information, and Control Engineering. He has served as School Chair of the H. Milton Stewart School of Industrial & Systems Engineering (2005–10) and holds the Schneider National Chair of Transportation and Logistics at Georgia Tech, where he is the former Director of the A.P. Sloan Foundation Trucking Industry Program and of The Logistics Institute. While at the University of Michigan, he was the founding Engineering Co-Director of what is now the Tauber Institute for Global Operations. He is a Fellow of the IEEE, a Fellow of INFORMS, and an INFORMS Edelman Laureate. He is a former member of the World Economic Forum Trade Facilitation Council.