Sparse Kernel-Based Least Squares Temporal Difference with Prioritized Sweeping.
Lecture Notes in Computer Science (2016)
Abstract
Improving the efficiency of algorithms for large-scale or continuous-space reinforcement learning (RL) problems is an active research topic. The kernel-based least squares temporal difference (KLSTD) algorithm can solve continuous-space RL problems, but it suffers from high computational complexity because of its kernel evaluations and costly matrix computations. To address this problem, this paper proposes an algorithm named sparse kernel-based least squares temporal difference with prioritized sweeping (PS-SKLSTD). PS-SKLSTD consists of two parts: learning and planning. In the learning process, we use a sparse kernel function based on approximate linear dependence (ALD) to represent the value function, and we update the parameter vectors using the Sherman-Morrison formula. In the planning process, we use the prioritized sweeping method to select the state-action pair to be updated at each step. The experimental results demonstrate that PS-SKLSTD achieves better convergence and computational efficiency than KLSTD.
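The Sherman-Morrison step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact update, so the code below assumes the standard recursive LSTD form, in which the matrix A grows by the rank-one term φ(s)(φ(s) − γφ(s'))ᵀ and its inverse is refreshed without a full re-inversion. The function name `sherman_morrison_update` and the toy feature vectors are hypothetical and chosen only for illustration.

```python
def sherman_morrison_update(A_inv, u, v):
    """Return (A + u v^T)^{-1} given A_inv = A^{-1}.

    Uses the Sherman-Morrison identity:
        (A + u v^T)^{-1} = A^{-1} - (A^{-1} u)(v^T A^{-1}) / (1 + v^T A^{-1} u)
    Cost is O(n^2) instead of the O(n^3) of a fresh inversion.
    """
    n = len(A_inv)
    # Column vector A^{-1} u
    Au = [sum(A_inv[i][k] * u[k] for k in range(n)) for i in range(n)]
    # Row vector v^T A^{-1}
    vA = [sum(v[k] * A_inv[k][j] for k in range(n)) for j in range(n)]
    denom = 1.0 + sum(v[k] * Au[k] for k in range(n))
    if abs(denom) < 1e-12:
        raise ValueError("rank-one update makes the matrix singular")
    return [[A_inv[i][j] - Au[i] * vA[j] / denom for j in range(n)]
            for i in range(n)]


# Toy usage (assumed values, not from the paper): one LSTD-style update.
gamma = 0.5                      # discount factor (assumption)
phi = [1.0, 0.0]                 # features of the current state (assumption)
phi_next = [0.0, 1.0]            # features of the next state (assumption)
A_inv = [[1.0, 0.0], [0.0, 1.0]] # inverse of the initial matrix A = I

u = phi
v = [p - gamma * q for p, q in zip(phi, phi_next)]
A_inv = sherman_morrison_update(A_inv, u, v)
```

In this sketch each transition costs a quadratic number of operations in the number of features, which is why keeping the kernel dictionary small through sparsification matters for overall efficiency.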
Keywords
Reinforcement learning, Prioritized sweeping, Sparse kernel, Least squares temporal difference