Finite-Horizon Constrained MDPs With Both Additive And Multiplicative Utilities

arXiv (2023)

Abstract
This paper considers the problem of solving finite-horizon constrained Markov decision processes (CMDPs) in which the objective and the constraints are each a sum of additive and multiplicative utilities. To this end, we construct another CMDP, with only additive utilities under a restricted set of policies, whose optimal value is equal to that of the original CMDP. Furthermore, we provide a finite-dimensional bilinear program (BLP) whose value equals the CMDP value and whose solution provides the optimal policy. We also suggest an algorithm to solve this BLP.
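
To make the two kinds of utility concrete, the following minimal sketch evaluates a fixed Markov policy on a small finite-horizon MDP by backward recursion, returning the expected additive utility (expected sum of per-step rewards) and the expected multiplicative utility (expected product of per-step factors). It is not the paper's construction, BLP, or algorithm. The function name `evaluate_policy`, the array layout, stationary transitions, and the toy numbers are all assumptions made for illustration.

```python
def evaluate_policy(P, r, m, pi, horizon):
    """Expected additive and multiplicative utilities of a Markov policy.

    All names and layouts here are illustrative assumptions:
      P[s][a][s2]  transition probability (assumed stationary)
      r[s][a]      additive per-step utility
      m[s][a]      multiplicative per-step factor
      pi[t][s][a]  probability of action a in state s at time t
    Returns (V, W), indexed by start state:
      V[s] = E[sum of r over the horizon], W[s] = E[product of m over the horizon].
    """
    n_states = len(P)
    V = [0.0] * n_states  # terminal value of an empty sum
    W = [1.0] * n_states  # terminal value of an empty product
    for t in reversed(range(horizon)):
        new_V, new_W = [], []
        for s in range(n_states):
            v = 0.0
            w = 0.0
            for a, p_a in enumerate(pi[t][s]):
                next_V = sum(p * V[s2] for s2, p in enumerate(P[s][a]))
                next_W = sum(p * W[s2] for s2, p in enumerate(P[s][a]))
                v += p_a * (r[s][a] + next_V)   # additive recursion
                w += p_a * m[s][a] * next_W     # multiplicative recursion
            new_V.append(v)
            new_W.append(w)
        V, W = new_V, new_W
    return V, W


# Toy instance (hypothetical numbers): 2 states, 2 actions, horizon 2.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0.0, 1.0], [0.0, 1.0]],  # state 1: absorbing
]
r = [[1.0, 0.0], [2.0, 2.0]]
m = [[0.5, 1.0], [2.0, 2.0]]
always_action_0 = [[1.0, 0.0], [1.0, 0.0]]
pi = [always_action_0, always_action_0]

V, W = evaluate_policy(P, r, m, pi, horizon=2)
combined = [v + w for v, w in zip(V, W)]  # one "additive + multiplicative" utility
```

In this sketch a constraint would be a second pair of `r` and `m` tables evaluated the same way and compared against a bound. Optimizing over policies, which is what the abstract's BLP addresses, is outside the scope of this example.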
Keywords
MDPs, additive, finite-horizon