Norm Learning with Reward Models from Instructive and Evaluative Feedback.

IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)(2022)

Abstract
People are increasingly interacting with artificial agents in social settings, and as these agents become more sophisticated, people will have to teach them social norms. Two prominent teaching methods include instructing the learner how to act and giving evaluative feedback on the learner's actions. Our empirical findings indicate that people naturally adopt both methods when teaching norms to a simulated robot, and that they use the methods selectively as a function of the robot's perceived expertise and learning progress. In our algorithmic work, we conceptualize a set of context-specific norms as a reward function and integrate learning from the two teaching methods under a single likelihood-based algorithm, which estimates a reward function that induces policies maximally likely to satisfy the teacher's intended norms. We compare robot learning under various teacher models and demonstrate that a robot responsive to both teaching methods can learn to reach its goal and minimize norm violations in a grid-world navigation task. We improve the robot's learning speed and performance by enabling teachers to give feedback at an abstract level (which rooms are acceptable to navigate) rather than at a low level (how to navigate any particular room).
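The abstract describes estimating a reward function by maximizing the likelihood of two kinds of teacher data: instructions (the teacher names the right action) and evaluations (the teacher scores a tried action). The sketch below is not the paper's algorithm; it is a minimal one-step illustration of that idea, with hypothetical room names and teacher data. Instructions are modeled with a softmax likelihood over room rewards, evaluations with a sigmoid likelihood of the feedback sign, and both log-likelihoods are maximized by gradient ascent on a shared reward parameter per room.

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical "which rooms are acceptable" setting (abstract-level feedback).
ROOMS = ["hall", "office", "lab"]

# Hypothetical teacher data: instructions name the acceptable room;
# evaluations give +1 / -1 feedback on a room the learner tried.
instructions = ["hall", "hall", "hall"]
evaluations = [("office", -1), ("lab", -1), ("hall", +1), ("lab", -1)]

theta = {r: 0.0 for r in ROOMS}  # learned reward per room
lr = 0.5

for _ in range(200):
    # Instruction likelihood: P(r* | theta) = softmax over room rewards.
    # Gradient of log-softmax w.r.t. theta[r] is indicator(r == r*) - p[r].
    for r_star in instructions:
        p = softmax([theta[r] for r in ROOMS])
        for i, r in enumerate(ROOMS):
            theta[r] += lr * ((1.0 if r == r_star else 0.0) - p[i])
    # Evaluation likelihood: P(f | r, theta) = sigmoid(f * theta[r]).
    # Gradient of its log w.r.t. theta[r] is f * (1 - sigmoid(f * theta[r])).
    for r, f in evaluations:
        theta[r] += lr * f * (1.0 - sigmoid(f * theta[r]))

best = max(ROOMS, key=lambda r: theta[r])
print(best)  # the room the teacher's norms favor
```

Under this toy data, both feedback channels push the reward for "hall" up and the others down, so a policy greedy in the learned reward satisfies the intended norm; the paper's full setting extends this to sequential grid-world navigation.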
Keywords
norm learning, reward models, instructive, evaluative feedback, artificial agents, social settings, social norms, prominent teaching methods, learner, simulated robot, perceived expertise, learning progress, algorithmic work, context-specific norms, reward function, single likelihood-based algorithm, robot learning, teacher models, norm violations, learning speed