Where Do Rewards Come From?

msra(2010)

Abstract
Reinforcement learning has achieved broad and successful application in cognitive science in part because of its general formulation of the adaptive control problem as the maximization of a scalar reward function. The computational reinforcement learning framework is motivated by correspondences to animal reward processes, but it leaves the source and nature of the rewards unspecified. This paper advances a general computational framework for reward that places it in an evolutionary context, formulating a notion of an optimal reward function given a fitness function and some distribution of environments. Novel results from computational experiments show how traditional notions of extrinsically and intrinsically motivated behaviors may emerge from such optimal reward functions. In the experiments these rewards are discovered through automated search rather than crafted by hand. The precise form of the optimal reward functions need not bear a direct relationship to the fitness function, but may nonetheless confer significant advantages over rewards based only on fitness.
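The framework the abstract describes can be sketched in code: an optimal reward function is one that, when used to drive learning, maximizes expected fitness over environments drawn from a distribution. The toy two-armed bandit, the epsilon-greedy learner, and the candidate reward set below are illustrative assumptions for this sketch, not the paper's actual experimental setup; note that the agent learns from its internal reward but is scored by fitness.

```python
import random

def make_env(rng):
    # Environment distribution (assumed for illustration): a 2-armed
    # bandit whose better arm varies across sampled environments.
    p = [rng.uniform(0.1, 0.4), rng.uniform(0.6, 0.9)]
    rng.shuffle(p)
    return p  # p[a] = probability that arm a pays off

def learn_and_evaluate(reward_fn, env, rng, steps=200):
    # Simple epsilon-greedy value learner driven by reward_fn.
    # Fitness is the total external payoff, counted separately from
    # the internal reward the agent learns on.
    q = [0.0, 0.0]
    fitness = 0
    for _ in range(steps):
        a = rng.randrange(2) if rng.random() < 0.1 else int(q[1] > q[0])
        payoff = 1 if rng.random() < env[a] else 0
        fitness += payoff
        r = reward_fn(a, payoff)        # internal reward need not equal fitness
        q[a] += 0.1 * (r - q[a])
    return fitness

def expected_fitness(reward_fn, n_envs=30, seed=0):
    # Expected fitness of a reward function over the environment distribution.
    rng = random.Random(seed)
    return sum(
        learn_and_evaluate(reward_fn, make_env(rng), rng) for _ in range(n_envs)
    ) / n_envs

# A tiny hand-picked search space of candidate reward functions.
candidates = {
    "fitness_only": lambda a, payoff: payoff,
    "always_zero": lambda a, payoff: 0.0,
    "payoff_minus_cost": lambda a, payoff: payoff - 0.05,
}

# Automated search: select the reward whose learners attain highest fitness.
best = max(candidates, key=lambda name: expected_fitness(candidates[name]))
```

In this sketch the search is a brute-force scan over a handful of candidates; the point it illustrates is the evaluation criterion (expected fitness of the *learned* behavior), under which a reward that differs from raw fitness can still win.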