Adaptive Fuzzy Watkins: A New Adaptive Approach for Eligibility Traces in Reinforcement Learning

International Journal of Fuzzy Systems (2019)

Abstract
Reinforcement learning is one of the most reliable approaches for solving a wide range of problems, and temporal difference methods are among the strongest members of the reinforcement learning family. The most important weakness of reinforcement learning methods, including temporal difference methods, is their slow convergence rate, and many studies have been devoted to addressing it. One of the proposed remedies is eligibility traces. Owing to the nature of off-policy methods, combining eligibility traces with them requires special attention. In the Watkins method (one of the dominant eligibility trace methods), cutting the eligibility traces whenever an exploratory action is taken early in the learning process diminishes the benefit of the traces. In this study, we propose a framework for combining eligibility traces with off-policy methods. The aim is to make proper use of the information gathered during the agent's exploratory actions; to this end, the decision about whether to apply the eligibility traces during exploratory actions is made by means of fuzzy adaptation. We apply this method to finding the goal state in static and dynamic grid worlds. We compare our approach against state-of-the-art techniques and show that it outperforms them both in averaged achieved reward and in convergence time.
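To make the idea concrete, the sketch below runs tabular Watkins's Q(λ) on a small grid world and marks the point where the proposed adaptive decision would replace the hard trace cut on exploratory actions. The grid world, hyperparameters, and the trace_cut_factor rule are illustrative assumptions for this sketch, not the authors' fuzzy inference system; standard Watkins's Q(λ) would simply zero the traces at that point.

```python
import numpy as np

# Minimal sketch: tabular Watkins's Q(lambda) on a toy 5x5 grid world.
# The trace_cut_factor function is a hypothetical placeholder for the
# paper's fuzzy adaptation, not the authors' actual inference system.

N = 5                                     # 5x5 grid
GOAL = (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
alpha, gamma, lam, eps = 0.1, 0.95, 0.9, 0.1
episodes, max_steps = 300, 2000

Q = np.zeros((N, N, len(ACTIONS)))

def step(s, a):
    """Move in the grid, clipping at the walls; small step cost, +1 at the goal."""
    r, c = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
    s2 = (min(max(r, 0), N - 1), min(max(c, 0), N - 1))
    return s2, (1.0 if s2 == GOAL else -0.01), s2 == GOAL

def trace_cut_factor(progress):
    """Hypothetical stand-in for the fuzzy decision: keep traces through
    exploratory actions early in learning, cut them (Watkins's rule) later."""
    return 1.0 - progress                 # 1.0 = keep traces, 0.0 = cut them

rng = np.random.default_rng(0)
for ep in range(episodes):
    e = np.zeros_like(Q)                  # eligibility traces
    s = (0, 0)
    for _ in range(max_steps):
        greedy = int(np.argmax(Q[s]))
        a = int(rng.integers(len(ACTIONS))) if rng.random() < eps else greedy
        s2, r, done = step(s, a)
        delta = r + gamma * np.max(Q[s2]) * (not done) - Q[s][a]
        e[s][a] += 1.0                    # accumulating trace
        Q += alpha * delta * e
        if a == greedy:
            e *= gamma * lam              # normal decay after a greedy action
        else:
            # Exploratory action: plain Watkins would set e to zero here;
            # the adaptive factor decides how much of the trace survives.
            e *= gamma * lam * trace_cut_factor(ep / episodes)
        s = s2
        if done:
            break

print("Greedy value at the start state:", Q[0, 0].max())
```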
Keywords
Reinforcement learning, Temporal difference, Watkins's, Fuzzy inference, Adaptive fuzzy Watkins (AFW)