Improving Imitation Learning by Merging Experts Trajectories

Conference on Information and Knowledge Management (2022)

Abstract
This paper proposes an original approach that combines expert trajectories with Deep Reinforcement Learning to produce a better Minecraft player. The combination rests on the idea that the problem is naturally decomposable and that the search space contains large plateaus. We use a two-step approach to build a better trajectory from all existing expert trajectories and then extract an optimal policy from it. The first step uses BIRCH clustering and cosine similarity between images to obtain a compact representation and a substantial reduction of the state and action spaces. To reduce the overall complexity, image distances are computed in the latent space of an encoder-decoder model. In the second step, we first eliminate plateaus, keeping only the nodes with non-zero rewards, and then compare trajectories using the Bellman equation and an appropriate value function. By incrementally checking the compatibility of the trajectories of compact representations, we build a solution that combines the best compatible sub-trajectories of the experts. Experimental results on the NeurIPS MineRL 2020 challenge show that training the actor model on the most rewarding extracted subset of trajectories achieves state-of-the-art performance on the Minecraft environment. The paper's source code is available here: https://github.com/thomJeffDoe/CompareTrajectories.
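The second step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Step` representation (a hypothetical cluster id paired with a reward), the function names, and the discount factor `gamma` are all assumptions made for illustration. The sketch only removes zero-reward plateau steps and ranks whole trajectories by a discounted return computed with the Bellman recursion G_t = r_t + gamma * G_{t+1}; the compatibility check and the merging of sub-trajectories from different experts are not reproduced here.

```python
from typing import List, Tuple

# Hypothetical representation: (cluster id of the compact state, reward).
Step = Tuple[int, float]


def drop_plateaus(trajectory: List[Step]) -> List[Step]:
    """Keep only the steps that carry a non-zero reward."""
    return [step for step in trajectory if step[1] != 0.0]


def discounted_returns(trajectory: List[Step], gamma: float = 0.99) -> List[float]:
    """Backward Bellman recursion: G_t = r_t + gamma * G_{t+1}."""
    returns = [0.0] * len(trajectory)
    running = 0.0
    for i in range(len(trajectory) - 1, -1, -1):
        running = trajectory[i][1] + gamma * running
        returns[i] = running
    return returns


def best_trajectory(trajectories: List[List[Step]], gamma: float = 0.99) -> List[Step]:
    """Return the plateau-free trajectory with the highest discounted return."""
    compact = [drop_plateaus(t) for t in trajectories]

    def value(trajectory: List[Step]) -> float:
        returns = discounted_returns(trajectory, gamma)
        return returns[0] if returns else 0.0

    return max(compact, key=value)


if __name__ == "__main__":
    expert_a = [(0, 0.0), (1, 1.0), (2, 0.0), (3, 2.0)]
    expert_b = [(0, 0.0), (4, 1.0), (5, 0.0)]
    print(best_trajectory([expert_a, expert_b], gamma=0.5))
```

Because plateau steps are removed before discounting, the discount in this sketch applies per rewarded step rather than per environment step; whether the paper's value function behaves the same way is not stated in the abstract.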
Keywords
imitation learning,trajectories,experts