Interactively Generating Explanations for Transformer-based Language Models

arXiv (2021)

Abstract
Transformer language models are state-of-the-art in a multitude of NLP tasks. Despite these successes, their opaqueness remains problematic. Recent methods aiming to provide interpretability and explainability to black-box models primarily focus on post-hoc explanations of (sometimes spurious) input-output correlations. Instead, we emphasize using prototype networks directly incorporated into the model architecture, which hence explain the reasoning process behind the network's decisions. Moreover, while our architecture performs on par with several language models, it enables one to learn from user interactions. This not only offers a better understanding of language models, but also uses human capabilities to incorporate knowledge outside of the rigid range of purely data-driven approaches.
Keywords
transformer language models, explanations