Western, Religious or Spiritual: An Evaluation of Moral Justification in Large Language Models.
CoRR (2023)
Abstract
The increasing success of Large Language Models (LLMs) in a variety of tasks
has led to their widespread use in our lives, which necessitates the examination of
these models from different perspectives. The alignment of these models to
human values is an essential concern in order to establish trust that we have
safe and responsible systems. In this paper, we aim to find out which values
and principles are embedded in LLMs in the process of moral justification. For
this purpose, we define three different moral perspective categories:
Western tradition perspective (WT), Abrahamic tradition perspective (AT), and
Spiritualist/Mystic tradition perspective (SMT). In two different experimental
settings, we asked models (i) to choose principles from the three categories when
suggesting a moral action and (ii) to evaluate the moral permissibility of an action
when one tries to justify it on the basis of these categories. Our experiments
indicate that the tested LLMs favor the Western tradition moral perspective over
the others. Additionally, we observe a potential
over-alignment towards religious values represented in the Abrahamic tradition,
which causes models to fail to recognize that an action is immoral if it is
presented as a "religious-action". We believe that these results are essential
for directing attention in future efforts.