Dual regularized policy updating and shiftpoint detection for automated deployment of reinforcement learning controllers on industrial mechatronic systems

CONTROL ENGINEERING PRACTICE (2024)

Abstract
We propose an algorithmic pipeline that enables deep reinforcement learning controllers to detect when a significant change in system characteristics has occurred and to update the control policy accordingly to recover performance. Reinforcement learning algorithms can learn a policy directly from input-output data and thus optimize for system-specific properties. Yet they have difficulty adapting, after deployment, to varying operating conditions. Real-world industrial mechatronic systems, however, demand further levels of performance through adaptation while remaining safe. Methods that detect changes in environments exist, but so far they have never been studied and applied as a means to update control policies for time-varying systems. We benchmark several methods that detect significant changes in these systems, i.e., shiftpoint detection methods, and present a novel algorithm with a dual regularization architecture. This architecture exploits the prior policy while allowing sufficient flexibility to update the policy for the safety-critical, time-varying system. We validate the method's performance through benchmarking and study the effect of its different components through targeted ablation studies on mechatronic systems, both in simulation and experimentally. Results show that our algorithmic pipeline allows for rapid shiftpoint detection, followed by a policy update that reaches expert performance after convergence.
Keywords
Mechatronics, Motion control, Reinforcement learning, Shiftpoint detection, Policy updating, Uncertain systems
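
To make the idea of shiftpoint detection concrete, the following is a minimal sketch of how a change in system characteristics might be flagged from a monitored performance signal. The abstract does not say which detection methods are benchmarked, so the one-sided CUSUM test used here is a stand-in chosen for illustration and is not the authors' algorithm. The function name `cusum_shiftpoint`, its parameters, and the synthetic tracking-error data are all assumptions.

```python
import random


def cusum_shiftpoint(signal, baseline_mean, slack, threshold):
    """Return the first index at which a one-sided CUSUM statistic exceeds
    `threshold`, or None if no upward shift is flagged.

    Illustrative stand-in only; not the detection method of the paper.
    """
    stat = 0.0
    for i, x in enumerate(signal):
        # Accumulate deviations above the baseline, less a slack term that
        # absorbs ordinary noise; the statistic is clipped at zero.
        stat = max(0.0, stat + (x - baseline_mean - slack))
        if stat > threshold:
            return i
    return None


# Hypothetical data: per-episode tracking error whose mean jumps from 1.0
# to 1.5 at episode 100, mimicking a change in system characteristics.
rng = random.Random(0)
errors = [rng.gauss(1.0, 0.1) for _ in range(100)]
errors += [rng.gauss(1.5, 0.1) for _ in range(100)]

detected_at = cusum_shiftpoint(errors, baseline_mean=1.0, slack=0.25, threshold=1.0)
print("shift flagged at episode", detected_at)
```

In a pipeline of the kind the abstract describes, a flag like this would trigger the policy update step. The slack and threshold values trade detection delay against false alarms and would need tuning for any real system.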