Advances in Reinforcement Learning Structures for Continuous-time Dynamical Systems


This paper presents new adaptive control structures, based on reinforcement learning, that compute online the solutions to optimal tracking control problems and multi-player differential games. We design a new family of adaptive controllers that converge in real time to optimal control and game-theoretic solutions by using data measured along the system trajectories. This is a new approach to data-driven optimization. Integral reinforcement learning is used to develop policy-iteration-based algorithms that find optimal solutions online and do not require full knowledge of the system dynamics. A new experience replay technique reuses past data for present learning and significantly speeds up convergence. A new off-policy learning method allows optimal solutions to be learned without any knowledge of the system dynamics. A new algorithm is presented for solving non-zero-sum multi-player games online for continuous-time systems. Each player maintains two adaptive learning structures, a critic network and an actor network. The result is an adaptive control system that learns from the interplay of agents in a game and delivers true online gaming behavior.
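To make the idea of integral reinforcement learning with policy iteration concrete, the following is a minimal sketch for the linear-quadratic special case only; it is not the paper's algorithm, and the tracking, experience replay, off-policy, and multi-player extensions are not shown. It assumes a two-state plant dx/dt = Ax + Bu with cost integrand x'Qx + u'Ru. The critic step fits the value matrix P by least squares to the integral Bellman relation x(t)'Px(t) - x(t+T)'Px(t+T) = integral of x'(Q + K'RK)x over [t, t+T], and the actor step sets K = R^{-1}B'P. The matrix A appears only inside the simulated plant that generates trajectory data; the learning update itself uses B but not A. All function names, the example system, and the parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def rk4_step(f, z, dt):
    """One classical Runge-Kutta step for dz/dt = f(z)."""
    k1 = f(z)
    k2 = f(z + 0.5 * dt * k1)
    k3 = f(z + 0.5 * dt * k2)
    k4 = f(z + dt * k3)
    return z + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def quad_basis(x):
    """Basis such that x' P x = quad_basis(x) . [P11, P12, P22] (two states)."""
    return np.array([x[0] ** 2, 2 * x[0] * x[1], x[1] ** 2])

def irl_policy_iteration(A, B, Q, R, K0, T=0.1, dt=1e-3, n_iter=8, seed=0):
    """Integral RL policy iteration for a 2-state LQR problem (illustrative)."""
    rng = np.random.default_rng(seed)
    K = K0.copy()
    for _ in range(n_iter):
        M = Q + K.T @ R @ K  # running-cost matrix under the current policy

        def plant(z):
            # Simulated plant plus running-cost integrator.
            # A is used here only to generate data, not by the learner.
            x = z[:2]
            u = -K @ x
            return np.concatenate([A @ x + B @ u, [x @ M @ x]])

        Phi, y = [], []
        for _ in range(10):  # several short trajectories for excitation
            x0 = rng.uniform(-1.0, 1.0, size=2)
            z = np.concatenate([x0, [0.0]])
            for _ in range(int(round(T / dt))):
                z = rk4_step(plant, z, dt)
            Phi.append(quad_basis(x0) - quad_basis(z[:2]))
            y.append(z[2])  # measured integral cost over [t, t + T]

        # Critic: least-squares fit of P to the integral Bellman relation.
        p, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        P = np.array([[p[0], p[1]], [p[1], p[2]]])
        # Actor: policy improvement, which needs B but not A.
        K = np.linalg.solve(R, B.T @ P)
    return P, K

# Hypothetical example system; A is stable, so K0 = 0 is a stabilizing start.
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P, K = irl_policy_iteration(A, B, Q, R, K0=np.zeros((1, 2)))

# Check: P should approximately satisfy the algebraic Riccati equation.
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print("max Riccati residual:", np.abs(residual).max())
```

The final Riccati check uses A purely to verify the learned result; in a real deployment the integral cost and state samples would come from measurements rather than a simulator.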