Partitioned external-memory value iteration

AAAI(2008)

引用 27|浏览31
暂无评分
摘要
Dynamic programming methods (including value iteration, LAO*, RTDP, and derivatives) are popular algorithms for solving Markov decision processes (MDPs). Unfortunately, however, these techniques store the MDP model extensionally in a table and thus are limited by the amount of main memory available. Since the required space is exponential in the number of domain features, these dynamic programming methods are ineffective for large problems. To address this problem, Edelcamp et al. devised the external memory value iteration (EMVI) algorithm, which uses a clever sorting scheme to efficiently move parts of the model between disk and main memory. While EMVI can handle larger problems than previously addressed, the need to repeatedly perform external sorts still limits scalability. This paper proposes a new approach. We partition an MDP into smaller pieces (blocks), keeping just the relevant blocks in memory and performing Bellman backups block by block. Experiments show that our algorithm is able to solve large MDPs an order of magnitude faster than EMVI.
更多
查看译文
关键词
mdp model extensionally,external memory value iteration,bellman backups block,dynamic programming method,large mdps,large problem,partitioned external-memory value iteration,external sort,relevant block,popular algorithm,main memory,markov decision process,value iteration,external memory
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要