Enhancing Policy Gradient with the Polyak Step-Size Adaption
arXiv (2024)
Abstract
Policy gradient is a widely used and foundational algorithm in reinforcement
learning (RL). Although it is renowned for its convergence guarantees and
stability compared to other RL algorithms, its practical application is often
hindered by sensitivity to hyper-parameters, particularly the step-size. In
this paper, we integrate the Polyak step-size into RL, which adjusts the
step-size automatically without prior knowledge. To adapt this method to the
RL setting, we address several issues, including the unknown optimal objective
value f* required by the Polyak step-size. We also evaluate the Polyak
step-size in RL through experiments, demonstrating faster convergence and
more stable resulting policies.
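As background (a standard definition from the optimization literature, not a detail taken from this paper): for minimizing an objective f whose optimal value f* is known, the Polyak step-size is eta_t = (f(x_t) - f*) / ||grad f(x_t)||^2. The minimal NumPy sketch below applies it to a toy problem where f* is known exactly; in RL the optimal return is generally unknown, which is the issue the abstract refers to. The names here (polyak_step_size, the toy quadratic objective) are illustrative, not from the paper.

```python
import numpy as np

def polyak_step_size(f_x, f_star, grad, eps=1e-8):
    """Classical Polyak step-size for minimizing f:
    eta_t = (f(x_t) - f*) / ||grad f(x_t)||^2.
    f_star is the optimal objective value; in RL it is generally
    unknown, which is the difficulty the paper addresses."""
    return (f_x - f_star) / (np.dot(grad, grad) + eps)

# Toy usage on f(x) = ||x||^2, whose minimum f* = 0 is known.
x = np.array([3.0, -2.0])
for _ in range(20):
    f_x = np.dot(x, x)
    grad = 2.0 * x
    eta = polyak_step_size(f_x, f_star=0.0, grad=grad)
    x = x - eta * grad
print(x)  # converges toward the minimizer [0, 0]
```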