Sparse Kernel-Based Least Squares Temporal Difference with Prioritized Sweeping.

Cijia Sun, Xinghong Ling, Yuchen Fu, Quan Liu, Haijun Zhu, Jianwei Zhai, Peng Zhang

Lecture Notes in Computer Science (2016)


Abstract

Improving the efficiency of algorithms that solve large-scale or continuous-space reinforcement learning (RL) problems is an active research topic. The kernel-based least squares temporal difference (KLSTD) algorithm can solve continuous-space RL problems, but it has high computational complexity because of its kernel evaluations and costly matrix computations. To address this problem, this paper proposes an algorithm named sparse kernel-based least squares temporal difference with prioritized sweeping (PS-SKLSTD). PS-SKLSTD consists of two parts: learning and planning. In the learning process, we use a sparse kernel function based on approximate linear dependence (ALD) to represent the value function, and we update the parameter vectors using the Sherman-Morrison formula. In the planning process, we use the prioritized sweeping method to select the state-action pair to update next. The experimental results demonstrate that PS-SKLSTD outperforms KLSTD in both convergence and computational efficiency.
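
As an illustration of the Sherman-Morrison step mentioned in the abstract, the following is a minimal sketch, not the paper's implementation. It assumes the standard LSTD rank-one form A_new = A + phi (phi - gamma * phi_next)^T, which the abstract does not spell out. The function names, the feature vectors phi and phi_next, and the discount gamma are hypothetical and chosen only for the example. The point is that the inverse of A is updated in O(n^2) time without recomputing a full matrix inverse.

```python
# Sketch only: rank-one inverse update via the Sherman-Morrison formula,
#   (A + u v^T)^{-1} = A^{-1} - (A^{-1} u)(v^T A^{-1}) / (1 + v^T A^{-1} u).
# The LSTD-style choice u = phi, v = phi - gamma * phi_next is an assumption.

def mat_vec(M, x):
    """Return the matrix-vector product M x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def vec_mat(x, M):
    """Return the row vector x^T M as a list."""
    return [sum(x[i] * M[i][j] for i in range(len(M))) for j in range(len(M[0]))]

def sherman_morrison_update(A_inv, u, v):
    """Return (A + u v^T)^{-1} given A^{-1}, using O(n^2) operations."""
    Au = mat_vec(A_inv, u)   # A^{-1} u
    vA = vec_mat(v, A_inv)   # v^T A^{-1}
    denom = 1.0 + sum(vi * aui for vi, aui in zip(v, Au))
    if abs(denom) < 1e-12:
        raise ValueError("update would make the matrix singular")
    n = len(A_inv)
    return [[A_inv[i][j] - Au[i] * vA[j] / denom for j in range(n)]
            for i in range(n)]

def lstd_inverse_update(A_inv, phi, phi_next, gamma):
    """Hypothetical helper: apply the assumed LSTD rank-one update to A^{-1}."""
    v = [p - gamma * pn for p, pn in zip(phi, phi_next)]
    return sherman_morrison_update(A_inv, phi, v)

# Usage: start from A = I (so A^{-1} = I) and process one transition.
A_inv = [[1.0, 0.0], [0.0, 1.0]]
A_inv = lstd_inverse_update(A_inv, [1.0, 0.5], [0.2, 1.0], 0.9)
print(A_inv)
```

In a kernel-based setting the feature vectors would be kernel evaluations against a dictionary of stored samples, so keeping that dictionary sparse keeps n, and hence the cost of each update, small.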


Keywords

Reinforcement learning, Prioritized sweeping, Sparse kernel, Least squares temporal difference
