Improved value iteration for neural-network-based stochastic optimal control design.

Neural Networks: The Official Journal of the International Neural Network Society (2020)

Abstract
This paper presents a novel value iteration adaptive dynamic programming (ADP) algorithm, called the improved value iteration ADP algorithm, for obtaining the optimal policy for discrete stochastic processes. The algorithm introduces, for the first time, a new criterion for verifying whether the obtained policy is stable for stochastic processes. A convergence analysis shows that the iterative value functions converge to the optimum. In addition, the algorithm allows the initial value function to be an arbitrary positive semi-definite function. Finally, two simulation examples validate the effectiveness of the developed method.
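
For orientation, the following is a minimal sketch of plain tabular value iteration on a small stochastic process. It is not the improved algorithm of the paper, and it does not include the paper's stability criterion or any neural-network approximation. The three-state transition table `P`, the stage costs `U`, the initial guess `V0`, and the function name `value_iteration` are all hypothetical and chosen only for illustration. The sketch shows the iteration starting from a non-optimal initial value function that is non-negative and zero at the goal state (the tabular analogue of a positive semi-definite function), and converging to the optimal cost-to-go.

```python
def value_iteration(P, U, V0, tol=1e-10, max_iters=1000):
    """Plain undiscounted value iteration on a finite stochastic process.

    P[a][s][t] : probability of moving from state s to state t under action a
    U[s][a]    : non-negative stage cost of taking action a in state s
    V0         : initial value function (assumed non-negative, zero at the goal)
    """
    n_states = len(V0)
    n_actions = len(P)

    def q_values(V):
        # Q(s, a) = U(s, a) + E[V(next state) | s, a]
        return [
            [U[s][a] + sum(P[a][s][t] * V[t] for t in range(n_states))
             for a in range(n_actions)]
            for s in range(n_states)
        ]

    V = list(V0)
    for _ in range(max_iters):
        V_new = [min(row) for row in q_values(V)]
        change = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if change < tol:
            break

    # Greedy policy with respect to the final value function.
    policy = [row.index(min(row)) for row in q_values(V)]
    return V, policy


# Hypothetical example: state 0 is an absorbing, zero-cost goal state.
P = [
    # action 0: cheap but unreliable
    [[1.0, 0.0, 0.0],
     [0.5, 0.5, 0.0],
     [0.0, 1.0, 0.0]],
    # action 1: expensive but jumps straight to the goal
    [[1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0]],
]
U = [
    [0.0, 0.0],  # goal state: no cost
    [1.0, 3.0],
    [1.0, 5.0],
]
V0 = [0.0, 10.0, 0.5]  # arbitrary non-negative guess, zero at the goal

V, policy = value_iteration(P, U, V0)
print(V, policy)
```

In this toy problem the update for state 1 is min(1 + 0.5 V(1), 3), whose fixed point is 2, so the cheap action is preferred in both non-goal states even though the initial guess overestimates one state and underestimates the other.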