Entity Models: Construction and Applications

msra(2019)

Abstract
We propose entity language models, a probabilistic representation of the language used to describe a named entity (person, organization, or location). The model is purely statistical and constructed from snippets of text surrounding mentions of an entity. We evaluate the effectiveness of entity models in three tasks: fact-based question answering, classification into pre-defined groups, and description of the relationship between two entities. The results on all tasks are promising.

To find out who someone is, we ask friends, read books, search libraries, browse the Web, etc., looking for information that describes the person. The more information we have gathered, the better a picture we develop. We might find out the person's career, what they are known for, who they have associated with, when they lived, and so on. Our picture of a person's "meaning" is constructed from numerous passages of text.

Inspired by that idea, we propose entity models, models of people, places, and other entities, based on how they are described. Our model is completely unstructured and based only on the text in our corpus. We do not employ any deep natural language processing beyond simple techniques for locating likely names, nor do we use a knowledge base to improve our representation.

Fundamentally, an entity model is a probabilistic unigram language model of the way that a name is discussed. We collect all references to a name and consider the text surrounding the mention. That data provides us with an estimate of the likelihood that a word will be used in the context of a person. Our hypothesis is that the high probability words will provide a useful representation of who a person is.

We explore our hypothesis primarily on entity models constructed for people, and to a limited degree for locations. We consider three tasks to see whether these models are appropriate:

1. We find that the effectiveness of our model when used to find answers to questions where the answer is known to be a person's name or a location is comparable with state-of-the-art question answering systems.
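
The construction described in the abstract (collect the text surrounding each mention of a name, then estimate word likelihoods from it) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the fixed token window, the regex tokenizer, the exact-match mention detection, and the unsmoothed maximum-likelihood estimate are all choices made here for brevity. The abstract does not specify any of them.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens; the paper's tokenizer is not specified.
    return re.findall(r"[a-z0-9']+", text.lower())


def build_entity_model(name, documents, window=5):
    """Estimate a unigram model P(word | entity) from mention contexts.

    Hypothetical helper: counts the words within `window` tokens on either
    side of each exact mention of `name` and normalizes the counts
    (maximum likelihood, no smoothing, no stop-word removal).
    """
    name_tokens = tokenize(name)
    n = len(name_tokens)
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == name_tokens:
                left = tokens[max(0, i - window):i]
                right = tokens[i + n:i + n + window]
                # The mention itself is excluded; only its context is counted.
                counts.update(left + right)
    total = sum(counts.values())
    if total == 0:
        return {}  # no mentions found in the corpus
    return {word: count / total for word, count in counts.items()}


def top_words(model, k=10):
    # Highest-probability words, the representation the hypothesis relies on.
    return sorted(model.items(), key=lambda item: (-item[1], item[0]))[:k]


if __name__ == "__main__":
    corpus = [
        "Marie Curie won the Nobel Prize in physics.",
        "The chemist Marie Curie discovered radium.",
    ]
    for word, prob in top_words(build_entity_model("Marie Curie", corpus, window=3)):
        print(f"{word}\t{prob:.3f}")
```

In this sketch, contexts of nearby mentions may overlap, so a word between two mentions is counted once per mention. A practical system would likely also need smoothing against a background corpus model and some handling of name variants, neither of which is attempted here.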