Are Robots Sociopaths? A Neuroscientific Approach to the Alignment Problem

Semantic Scholar (2022)

Abstract
Artificial intelligence (AI) is expanding into every niche of human life, organizing our activity, expanding our agency and interacting with us to an exponentially increasing extent. At the same time, AI’s efficiency, complexity and refinement are growing at an accelerating speed. An expanding, ubiquitous intelligence that does not have a means to care about us poses a species-level risk. Justifiably, there is a growing concern with the immediate problem of how to engineer an AI that is aligned with human interests. Computational approaches to the alignment problem currently focus on engineering AI systems to (i) parameterize human values such as harm and flourishing, and (ii) avoid overly drastic solutions, even if these are seemingly optimal. In parallel, ongoing work in applied AI (caregiving, consumer care) is concerned with developing artificial empathy, teaching AIs to decode human feelings and behavior, and evince appropriate emotional responses. We propose that in the absence of affective empathy (which allows us to share in the states of others), existing approaches to artificial empathy may fail to reliably produce the pro-social, caring component of empathy, potentially resulting in increasingly cognitively complex sociopaths. We adopt the colloquial usage of the term “sociopath” to signify an intelligence possessing cognitive empathy (i.e., the ability to decode, infer, and model the mental and affective states of others), but crucially lacking pro-social, empathic concern arising from shared affect and embodiment. It is widely acknowledged that aversion to causing harm is foundational to the formation of empathy and moral behavior. However, harm aversion is itself predicated on the experience of harm, within the context of the preservation of physical integrity. Following from this, we argue that a “top-down” rule-based approach to achieving caring AI may be inherently unable to anticipate and adapt to the inevitable novel moral/logistical dilemmas faced by an expanding AI. Crucially, it may be more effective to coax caring to emerge from the bottom up, baked into an embodied, vulnerable artificial intelligence with an incentive to preserve its physical integrity. This may be achieved via iterative optimization within a series of tailored environments with incentives and contingencies inspired by the development of empathic concern in humans. Here we attempt an outline of what these training steps might look like. We speculate that work of this kind may allow for AI that surpasses empathic fatigue and the idiosyncrasies, biases, and computational limits that restrict human empathy. While for us, “a single death is a tragedy, a million deaths are a statistic”, the scalable complexity of AI may allow it to deal proportionately with complex, large-scale ethical dilemmas. Hopefully, by addressing this problem seriously in the early stages of AI’s integration with society, we may one day be accompanied by AI that plans and behaves with a deeply ingrained weight placed on the welfare of others, coupled with the cognitive complexity necessary to understand and solve extraordinary problems.
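To make the proposed bottom-up route concrete, here is a minimal sketch, not taken from the paper, of how an embodied, vulnerable agent's reward signal might be shaped. All names, weights, and structure below are illustrative assumptions: the agent learns harm aversion from damage to its own "body", and a shared-affect term then reuses that same harm signal when it observes damage to another agent, so concern for others is grounded in self-preservation rather than in top-down rules.

```python
from dataclasses import dataclass

@dataclass
class BodyState:
    integrity: float  # hypothetical scalar: 1.0 = intact, 0.0 = destroyed

def harm(prev: BodyState, curr: BodyState) -> float:
    """Harm experienced this step: loss of physical integrity."""
    return max(0.0, prev.integrity - curr.integrity)

def empathic_reward(self_prev: BodyState, self_curr: BodyState,
                    other_prev: BodyState, other_curr: BodyState,
                    task_reward: float,
                    w_self: float = 1.0,
                    w_other: float = 0.5) -> float:
    """Task reward minus own harm, minus vicarious harm.

    The other agent's integrity loss is run through the same harm()
    function the agent applies to itself -- a crude stand-in for shared
    affect. w_other could be annealed upward over a curriculum of
    environments, mirroring the staged development of empathic concern.
    """
    own_pain = harm(self_prev, self_curr)
    vicarious_pain = harm(other_prev, other_curr)
    return task_reward - w_self * own_pain - w_other * vicarious_pain

# Example step: the agent earns task reward while another agent is harmed.
r = empathic_reward(BodyState(0.9), BodyState(0.9),
                    BodyState(1.0), BodyState(0.7),
                    task_reward=1.0)
print(r)  # 1.0 - 1.0 * 0.0 - 0.5 * 0.3 = 0.85
```

In this toy framing, the "tailored environments" the abstract describes would vary the task reward, the agent's exposure to damage, and the visibility of other agents' harm, with the empathic weight emerging or being tuned across stages rather than being fixed in advance.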