Pseudo-MDPs and factored linear action models
ADPRL (2014)
Abstract
In this paper we introduce the concept of pseudo-MDPs to develop abstractions. Pseudo-MDPs relax the requirement that the transition kernel be a probability kernel. We show that the new framework captures many existing abstractions. We also introduce factored linear action models as a special case, and discuss how they relate to existing work. We use the general framework to develop a theory for bounding the suboptimality of policies derived from pseudo-MDPs. Specializing the framework, we recover existing results. We give a least-squares approach and a constrained optimization approach for learning the factored linear model, as well as efficient computation methods. We demonstrate that the constrained optimization approach gives better performance than the least-squares approach with normalization.
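To make the relaxed-kernel idea concrete, here is a minimal sketch, not taken from the paper, of value iteration on a small finite pseudo-MDP whose "transition" weights are signed and do not sum to one. The function name `value_iteration`, the two-state example, and the sufficient condition checked in the code (discount times the largest absolute row sum is below one, which makes the Bellman operator a max-norm contraction) are all illustrative assumptions rather than the authors' construction.

```python
def value_iteration(kernels, rewards, gamma, tol=1e-10, max_iter=10_000):
    """Value iteration for a finite pseudo-MDP (illustrative sketch).

    kernels[a][x][y] is a signed weight from state x to state y under
    action a; rows need not be nonnegative or sum to one.
    rewards[a][x] is the immediate reward for action a in state x.
    """
    n_states = len(rewards[0])
    n_actions = len(kernels)

    # Assumed sufficient condition for convergence: the Bellman operator
    # is a max-norm contraction when gamma * max absolute row sum < 1.
    max_row_sum = max(sum(abs(w) for w in row) for k in kernels for row in k)
    if gamma * max_row_sum >= 1.0:
        raise ValueError("gamma * max absolute row sum must be below 1")

    v = [0.0] * n_states
    for _ in range(max_iter):
        new_v = [
            max(
                rewards[a][x]
                + gamma * sum(kernels[a][x][y] * v[y] for y in range(n_states))
                for a in range(n_actions)
            )
            for x in range(n_states)
        ]
        # Stop once successive iterates agree to within tol.
        if max(abs(p - q) for p, q in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Hypothetical two-state, two-action pseudo-MDP: row sums are 1.1, 0.8,
# 1.0 and 1.0, and one weight is negative, so these are not probabilities.
KERNELS = [
    [[0.7, 0.4], [0.2, 0.6]],
    [[1.1, -0.1], [0.5, 0.5]],
]
REWARDS = [[1.0, 0.0], [0.0, 0.5]]
GAMMA = 0.8

values = value_iteration(KERNELS, REWARDS, GAMMA)
```

The contraction check is deliberately conservative: it is one simple condition under which the iteration provably converges, and the paper's own analysis may rely on different assumptions.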
Keywords
pseudo-MDP, transition kernel, computation method, factored linear model, probability kernel, normalization, least-squares approach, constrained optimization approach, suboptimality, least squares approximations, constraint handling, Markov processes, factored linear action model, probability