Benign Nonconvex Landscapes in Optimal and Robust Control, Part I: Global Optimality

Yang Zheng, Chih-fan Pai, Yujie Tang

CoRR (2023)

Abstract
Direct policy search has achieved great empirical success in reinforcement learning. Many recent studies have revisited its theoretical foundation for continuous control, revealing elegant nonconvex geometry in various benchmark problems, especially in fully observable state-feedback cases. This paper considers two fundamental optimal and robust control problems with partial observability: Linear Quadratic Gaussian (LQG) control with stochastic noise, and ℋ_∞ robust control with adversarial noise. In the policy space, the former problem is smooth but nonconvex, while the latter is nonsmooth and nonconvex. We highlight some interesting and surprising “discontinuity” of the LQG and ℋ_∞ cost functions around the boundary of their domains. Despite the lack of convexity (and possibly smoothness), our main results show that, for a class of non-degenerate policies, all Clarke stationary points are globally optimal and there is no spurious local minimum for both LQG and ℋ_∞ control. Our proof techniques rely on a new and unified framework of Extended Convex Lifting (ECL), which bridges the gap between nonconvex policy optimization and convex reformulations. This ECL framework is of independent interest, and we will discuss its details in Part II of this paper.