On the Influence of Time-Correlation in Initial Training Data for Model-Based Policy Search

Elias Hanna, Stephane Doncieux

2023 21st International Conference on Advanced Robotics (ICAR), 2023

Abstract
Model-based policy search techniques are a way to quickly learn real-world robotics behaviors. Indeed, learning a model allows the robot to imagine interactions with its environment and to further refine its behavior without actually interacting with it. Nevertheless, a good model is critical for such approaches to succeed, and the initial model training steps can drastically change the performance of the policy search algorithm. In this paper, we study the impact of various initial data gathering methods for bootstrapping model training in a model-based policy search algorithm. We compare five initialization methods, two used in the state of the art and the remaining three representing various degrees of time-correlation. We then show a link between the time-correlation of the initialization method and an environment metric that we call consistency, empirically demonstrating on three robotic systems that model prediction error can be predicted from environment consistency and the time-correlation of the initialization method's action sequences. We finally show the impact the initialization method has on a state-of-the-art model-based policy search algorithm, demonstrating results consistent with the proposed metric and the initialization methods' prediction errors, allowing for a better choice of initial training data beforehand.
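To make the notion of time-correlation in action sequences concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes one-dimensional actions bounded in [-1, 1] and contrasts two hypothetical generators: independent uniform sampling (no time-correlation) and a clipped random walk (strong time-correlation). The function names, the step size, and the use of lag-1 autocorrelation as the measure are all assumptions made for illustration; the abstract does not specify how the five initialization methods are defined.

```python
import random

def uniform_actions(rng, steps):
    """Hypothetical baseline: each action is drawn independently."""
    return [rng.uniform(-1.0, 1.0) for _ in range(steps)]

def random_walk_actions(rng, steps, step_size=0.1):
    """Hypothetical correlated generator: previous action plus small
    Gaussian noise, clipped to the assumed action bounds [-1, 1]."""
    actions, current = [], 0.0
    for _ in range(steps):
        current = max(-1.0, min(1.0, current + rng.gauss(0.0, step_size)))
        actions.append(current)
    return actions

def lag1_autocorrelation(seq):
    """Sample autocorrelation of a sequence at lag 1."""
    mean = sum(seq) / len(seq)
    centered = [x - mean for x in seq]
    num = sum(a * b for a, b in zip(centered[:-1], centered[1:]))
    den = sum(c * c for c in centered)
    return num / den

rng = random.Random(0)  # fixed seed for repeatability
uncorrelated = uniform_actions(rng, 1000)
correlated = random_walk_actions(rng, 1000)

print("uniform lag-1 autocorrelation:", lag1_autocorrelation(uncorrelated))
print("random-walk lag-1 autocorrelation:", lag1_autocorrelation(correlated))
```

Under these assumptions, the uniform sequence should show a lag-1 autocorrelation near zero and the random walk one close to one, which is the kind of contrast the abstract refers to when it speaks of initialization methods with various degrees of time-correlation.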
Keywords
Policy Search, Prediction Error, Sequence Of Actions, Robotic System, Model-based Algorithm, Prediction Error Of Model, Neural Network, Deep Neural Network, Real-world Data, Brownian Motion, Random Walk, Noise Sources, Markov Decision Process, Average Prediction, Transition Function, Environmental Agents, Generation Of Particles, Noise Spectrum, True Function, Real Robot, Mean Prediction Error, Random Action, Noise Sequence, Real Setup, Initial Returns, Autocorrelation Values, Time Series, Impact Of Initiatives, System Dynamics