Parameters Optimization for Reinforcement Learning with Nonlinear Time-Varying Strategy by Using Uniform Experiment Design
2021 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS) (2021)
Abstract
This paper optimizes the parameters of reinforcement learning (RL) so that agents solve problems more accurately and efficiently. The RL algorithm used is Q-learning, and the experimental environments are mazes. Three parameters influence overall RL performance: the learning rate, the greedy factor, and the discount rate. This paper introduces a nonlinear time-varying strate...
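For orientation, the three parameters named in the abstract can be seen in a minimal sketch of the standard tabular Q-learning step. This is the textbook form, not the paper's method: the abstract is truncated before it describes the nonlinear time-varying strategy, so the parameters are held constant here. The function names `epsilon_greedy` and `q_update` are hypothetical, and the greedy factor is assumed to be the exploration probability `epsilon` (the paper may define it the other way around).

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise the best-valued one.

    Assumption: 'greedy factor' is modeled as the exploration probability.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Apply one standard Q-learning step and return the updated value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    alpha is the learning rate and gamma is the discount rate.
    """
    td_target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical two-state, two-action table, for illustration only.
    q = [[0.0, 0.0], [1.0, 2.0]]
    action = epsilon_greedy(q[0], epsilon=0.1, rng=rng)
    q_update(q, state=0, action=action, reward=1.0, next_state=1,
             alpha=0.5, gamma=0.9)
    print(q[0])
```

A time-varying schedule would replace the constant `alpha`, `gamma`, and `epsilon` arguments with values computed from the episode or step index; the form of that schedule is not recoverable from the truncated abstract.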
Keywords
Q-learning, Communication systems, Signal processing algorithms, Signal processing, Time-varying systems, Optimization, Convergence