Quantitative controller synthesis for consumption Markov decision processes

Information Processing Letters (2023)

Abstract
Consumption Markov decision processes (CMDPs) are resource-constrained probabilistic decision-making systems in which each decision consumes some amount of a resource that can be reloaded to full capacity in a designated set of states. We consider the quantitative reachability-probability controller synthesis problems for CMDPs, which ask for a policy that prevents resource exhaustion and achieves (repeated) reachability goals with maximal probability, in contrast to the original work on qualitative controller synthesis for CMDPs, e.g., fulfilling (repeated) reachability goals with positive probability or with probability one. We first prove that the exact quantitative reachability-probability problems are NP-hard by a reduction from the 0–1 Knapsack problem. We solve them by converting a given CMDP to a flattened Markov decision process (MDP) and then constructing an optimal policy for that MDP. Unfortunately, the naive flattened MDPs are prohibitively large, so the subsequent synthesis does not scale well. To overcome this drawback, we propose a pruned construction and a quotient construction of the flattened MDP that reduce its size. An empirical evaluation shows that our optimizations significantly improve the scalability of the purely flattening-based synthesis method for CMDPs on benchmarks from real-world scenarios: autonomous electric vehicle routing and helicopter planning in the Mars Perseverance Rover project.
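The following Python sketch illustrates the flattening step the abstract refers to, under an assumed toy encoding of a CMDP: the function name `flatten_cmdp`, the dict-based transition and cost tables, and the reload semantics (entering a reload state resets the level to full capacity) are illustrative assumptions, not the paper's actual implementation.

```python
# A minimal sketch of the flattened-MDP construction: states of the
# flattened MDP are (cmdp_state, resource_level) pairs, an action is
# enabled only when enough resource remains to pay its cost, and
# reload states replenish the resource to full capacity.
from itertools import product

def flatten_cmdp(states, actions, cap, cost, delta, reload_states):
    """Build the flattened MDP of a CMDP with capacity `cap`.

    cost:  dict (s, a) -> consumption of taking a in s
    delta: dict (s, a) -> list of (successor, probability) pairs
    reload_states: set of states where the resource is reloaded
    """
    flat_states = list(product(states, range(cap + 1)))
    flat_delta = {}  # ((s, r), a) -> list of ((s', r'), probability)
    for (s, r) in flat_states:
        for a in actions:
            if (s, a) not in delta:
                continue  # action not enabled in s
            c = cost[(s, a)]
            if c > r:
                continue  # not enough resource left: action disabled
            succs = []
            for s_next, p in delta[(s, a)]:
                # reload states reset the level to full capacity;
                # otherwise the consumption c is deducted
                r_next = cap if s_next in reload_states else r - c
                succs.append(((s_next, r_next), p))
            flat_delta[((s, r), a)] = succs
    return flat_states, flat_delta
```

In this naive form the flattened MDP has |S| * (cap + 1) states, which is exactly the blow-up the abstract points to and what the pruned and quotient constructions are designed to mitigate.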
Keywords
Markov decision process, Scheduling, Quantitative analysis, Reachability, NP-hardness