Finite- and Infinite-Horizon Shapley Games with Nonsymmetric Partial Observation.

SIAM Journal on Control and Optimization (2015)

Abstract
We consider asymmetric partially observed Shapley-type finite-horizon and infinite-horizon games where the state, a controlled Markov chain {X_t}, is not observable to one player (minimizer) who observes only a state-dependent signal {Y_t}. The maximizer observes both. The minimizer is informed of the maximizer's action after (before) choosing his control in the MINMAX (MAXMIN) game. A nontrivial open problem in such situations is how the minimizer can use this knowledge to update his belief about {X_t}. To address this, the maximizer uses off-line control functions which are known to the minimizer. Using these, novel control-parameterized nonlinear filters are constructed which are proved to characterize the conditional distribution of the full path of {X_t}. Using these filters, recursive algorithms are developed which show that saddle-points exist in both behavioral and Markov strategies for the finite-horizon case in both games. These algorithms are extended to prove saddle-points in Markov strategies for both games for the infinite-horizon case. A counterexample shows that the finite-horizon MINMAX value may be greater than the MAXMIN value. We show that the asymptotic limits of these values converge to the corresponding MINMAX and MAXMIN saddle-point values in the infinite-horizon setup. Another counterexample shows that the uniform value need not exist.
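To make the belief-update idea concrete, the following is a minimal sketch of a single filtering step for the minimizer. It is not the paper's control-parameterized filter: it assumes, purely for illustration, a finite state space, a deterministic control function `g` for the maximizer that depends only on the current state, and a one-step (not full-path) belief. The names `filter_step`, `P`, `O`, and `g` are hypothetical. The minimizer first conditions his belief on the revealed action being consistent with `g`, then applies a standard hidden-Markov predict and correct step.

```python
def filter_step(belief, u, v, y, P, O, g):
    """One illustrative belief update for the uninformed player.

    belief : list of probabilities over the current state X_t
    u, v   : minimizer's action and the maximizer's revealed action
    y      : index of the next observed signal Y_{t+1}
    P      : P[u][v][x][x2] = Pr(X_{t+1} = x2 | X_t = x, u, v)   (assumed model)
    O      : O[x2][y] = Pr(Y_{t+1} = y | X_{t+1} = x2)           (assumed model)
    g      : g[x] = maximizer's action in state x (assumed known, deterministic)
    """
    n = len(belief)

    # Step 1: condition on the revealed action. Only states x with
    # g[x] == v are consistent with what the minimizer was told.
    conditioned = [belief[x] if g[x] == v else 0.0 for x in range(n)]
    total = sum(conditioned)
    if total == 0.0:
        raise ValueError("revealed action is inconsistent with the belief")
    conditioned = [p / total for p in conditioned]

    # Step 2: predict the next state under the controlled transition law.
    predicted = [
        sum(conditioned[x] * P[u][v][x][x2] for x in range(n))
        for x2 in range(n)
    ]

    # Step 3: correct with the likelihood of the observed signal.
    unnormalized = [predicted[x2] * O[x2][y] for x2 in range(n)]
    norm = sum(unnormalized)
    if norm == 0.0:
        raise ValueError("observed signal has zero likelihood")
    return [p / norm for p in unnormalized]


# Toy instance: two states, one minimizer action, two maximizer actions.
P = [[
    [[0.5, 0.5], [0.5, 0.5]],   # transitions under (u=0, v=0)
    [[0.9, 0.1], [0.2, 0.8]],   # transitions under (u=0, v=1)
]]
O = [[0.7, 0.3], [0.4, 0.6]]    # signal likelihoods per next state
g = [0, 1]                      # maximizer plays action x in state x

posterior = filter_step([0.5, 0.5], u=0, v=1, y=0, P=P, O=O, g=g)
```

In this toy instance the control function is one-to-one, so the revealed action pins down the current state exactly; a control function that maps several states to the same action would only partially sharpen the belief.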
Keywords
nonsymmetric partially observed game, parameterized filtering, zero-sum stochastic game, finite-horizon and infinite-horizon discounted cost, dynamic programming algorithms