Culturally-Attuned Moral Machines: Implicit Learning of Human Value Systems by AI through Inverse Reinforcement Learning
CoRR (2023)
Abstract
Constructing a universal moral code for artificial intelligence (AI) is
difficult or even impossible, given that different human cultures have
different definitions of morality and different societal norms. We therefore
argue that the value system of an AI should be culturally attuned: just as a
child raised in a particular culture learns the specific values and norms of
that culture, we propose that an AI agent operating in a particular human
community should acquire that community's moral, ethical, and cultural codes.
How AI systems might acquire such codes from human observation and interaction
has remained an open question. Here, we propose using inverse reinforcement
learning (IRL) as a method for AI agents to acquire a culturally-attuned value
system implicitly. We test our approach using an experimental paradigm in which
AI agents use IRL to learn different reward functions, which govern the agents'
moral values, by observing the behavior of different cultural groups in an
online virtual world requiring real-time decision making. We show that an AI
agent learning from the average behavior of a particular cultural group can
acquire altruistic characteristics reflective of that group's behavior, and
this learned value system can generalize to new scenarios requiring altruistic
judgments. Our results provide, to our knowledge, the first demonstration that
AI agents could potentially be endowed with the ability to continually learn
their values and norms from observing and interacting with humans, thereby
becoming attuned to the culture they are operating in.
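The abstract describes agents that use inverse reinforcement learning (IRL) to recover a reward function (standing in for a group's values) from observed behavior. As a rough illustration of the general idea, not the paper's actual environment or algorithm, the sketch below runs maximum-entropy IRL on a hypothetical five-state chain world in which an "expert" cultural group always moves toward state 4; gradient updates on the reward weights push the learner's state-visitation frequencies to match the expert's.

```python
import numpy as np

# Toy maximum-entropy IRL sketch (hypothetical setup, not the paper's
# experiment): a 5-state chain; the observed "expert" group always moves
# right toward state 4. We recover a reward vector theta over states that
# explains this behavior by matching state-visitation frequencies.

N_STATES, N_ACTIONS, HORIZON = 5, 2, 10  # actions: 0 = left, 1 = right

def step(s, a):
    """Deterministic chain dynamics with walls at both ends."""
    return max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)

def expert_visitations():
    """State-visitation frequencies of an expert that always moves right."""
    mu, s = np.zeros(N_STATES), 0
    for _ in range(HORIZON):
        mu[s] += 1.0
        s = step(s, 1)
    return mu / HORIZON

def soft_policy(theta):
    """Finite-horizon soft (max-ent) value iteration under reward theta."""
    V = np.zeros(N_STATES)
    for _ in range(HORIZON):
        Q = np.array([[theta[s] + V[step(s, a)] for a in range(N_ACTIONS)]
                      for s in range(N_STATES)])
        m = Q.max(axis=1)                       # stable log-sum-exp
        V = m + np.log(np.exp(Q - m[:, None]).sum(axis=1))
    return np.exp(Q - V[:, None])               # stochastic policy pi[s, a]

def learner_visitations(theta):
    """Expected state-visitation frequencies of the soft policy from state 0."""
    pi = soft_policy(theta)
    d = np.zeros(N_STATES)
    d[0] = 1.0
    mu = np.zeros(N_STATES)
    for _ in range(HORIZON):
        mu += d
        d_next = np.zeros(N_STATES)
        for s in range(N_STATES):
            for a in range(N_ACTIONS):
                d_next[step(s, a)] += d[s] * pi[s, a]
        d = d_next
    return mu / HORIZON

# Max-ent IRL gradient: expert visitations minus learner visitations.
mu_E = expert_visitations()
theta = np.zeros(N_STATES)
for _ in range(300):
    theta += 0.1 * (mu_E - learner_visitations(theta))
```

After training, the recovered reward favors the expert's preferred end of the chain (`theta[4] > theta[0]`), so a policy optimized against it reproduces the group's observed tendency; the same mechanism, with richer features, is what lets an IRL agent generalize learned values to new scenarios.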