A Probabilistic Forward Search Value Iteration Algorithm for POMDP

2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), 2019

Abstract
Point-based value iteration methods are a class of practical algorithms for solving POMDPs. The critical step in these methods is the exploration of the belief point set B. Forward search value iteration (FSVI) reduces complexity and improves efficiency significantly by using the optimal policy of the underlying MDP. However, it does not use the model's observations, which makes it less efficient on large-scale POMDP problems. This paper presents a probabilistic forward search value iteration algorithm (PFSVI) to address this shortcoming of FSVI. During exploration, PFSVI uses the alias method to sample the action a* according to a weighted Q_MDP function, and samples the state according to the current belief b and the transition function. PFSVI then selects the observation z whose successor belief point b^{a*,z} is farthest from B. Because it samples according to the environment and reaches a larger region of the belief space than FSVI, PFSVI can improve performance. Experimental results on four benchmarks show that PFSVI can achieve better global optimal solutions than FSVI and PBVI, especially on large-scale problems.
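The alias method named in the abstract draws from a discrete distribution in constant time after a linear-time table build. The sketch below is a generic version of that technique (Vose's variant), not the authors' implementation. The function names `build_alias_table` and `alias_sample` are hypothetical, and the example weights stand in for the weighted Q_MDP values, whose exact weighting scheme the abstract does not specify.

```python
import random


def build_alias_table(weights):
    """Build alias-method tables (Vose's variant) from non-negative weights."""
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")

    # Scale so that the average entry equals 1.
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]

    # Pair each under-full column with an over-full one.
    while small and large:
        s = small.pop()
        g = large.pop()
        prob[s] = scaled[s]
        alias[s] = g
        scaled[g] = scaled[g] + scaled[s] - 1.0
        (small if scaled[g] < 1.0 else large).append(g)

    # Leftover columns are full, up to floating-point rounding.
    for i in large + small:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias


def alias_sample(prob, alias, rng=random):
    """Draw one index in O(1): pick a column, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


# Hypothetical per-action weights, standing in for weighted Q_MDP values.
action_weights = [1.0, 1.0, 2.0]
prob, alias = build_alias_table(action_weights)
sampled_action = alias_sample(prob, alias, random.Random(0))
```

Building the table once per belief point and then sampling repeatedly is where the method pays off; for a single draw, a plain cumulative-sum lookup would cost about the same.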
Keywords
POMDP, PFSVI, Alias Method