Transform then Explore: a Simple and Effective Technique for Exploratory Combinatorial Optimization with Reinforcement Learning
arXiv (2024)
Abstract
Many complex problems encountered in both production and daily life can be
conceptualized as combinatorial optimization problems (COPs) over graphs.
In recent years, reinforcement learning (RL) based models have emerged as a
promising direction, treating COP solving as a heuristic learning
problem. However, current RL models based on finite-horizon MDPs have inherent
limitations: they cannot explore adequately to improve solutions
at test time, which may be necessary given the complexity of NP-hard
optimization tasks. Some recent attempts address this issue through reward
design and state feature engineering, which are tedious and ad hoc. In this
work, we instead propose a much simpler but more effective technique, named
gauge transformation (GT). The technique originates from physics, yet it is
very effective in enabling RL agents to keep exploring and continuously improve
solutions at test time. Moreover, GT is very simple: it can be implemented
in fewer than 10 lines of Python code and applied to the vast majority
of RL models. Experimentally, we show that traditional RL models equipped with GT
achieve state-of-the-art performance on the MaxCut problem.
Furthermore, since GT is independent of any particular RL model, it can be seamlessly
integrated into various RL frameworks, paving the way for these models to explore
more effectively when solving general COPs.
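
The abstract does not spell out the transformation itself, so the following is only a minimal sketch, assuming the standard Ising-model gauge transformation from physics: spins s_i in {-1, +1} are multiplied by a gauge t_i in {-1, +1}, and each coupling W_ij is multiplied by t_i * t_j, which leaves the energy sum of W_ij * s_i * s_j unchanged. Choosing t equal to the current solution maps that solution to the all-(+1) state, so an agent could in principle treat it as a fresh starting point. The function names (ising_energy, gauge_transform, undo_gauge, cut_value) are illustrative and are not taken from the paper.

```python
import numpy as np

def ising_energy(W, s):
    """Sum over i<j of W[i, j] * s[i] * s[j] (W symmetric, zero diagonal)."""
    return 0.5 * s @ W @ s

def cut_value(W, s):
    """MaxCut objective: total weight of edges whose endpoints differ in sign."""
    return 0.25 * np.sum(W * (1.0 - np.outer(s, s)))

def gauge_transform(W, s):
    """Assumed form of GT: use the current spins as the gauge.

    Returns a transformed weight matrix, the transformed solution
    (all +1), and the gauge needed to map solutions back.
    """
    t = s.copy()
    W_gt = W * np.outer(t, t)   # W'[i, j] = W[i, j] * t[i] * t[j]
    s_gt = s * t                # s'[i] = s[i] * t[i] = +1
    return W_gt, s_gt, t

def undo_gauge(s_gt, t):
    """Map a solution of the transformed instance back to the original one."""
    return s_gt * t
```

Under this assumption, any solution found on the transformed instance has the same Ising energy as its back-mapped counterpart on the original graph, so the cut value can always be evaluated on the original weights after calling undo_gauge.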