Lexical translation with application to image searching on the web.

MT Summit (2007)

Abstract
We introduce a novel approach to the task of lexical translation. We utilize the translation graph, a massive lexical resource where each node denotes a word in some language and each edge denotes a word sense shared by a pair of words. Our current graph contains 1,267,460 nodes and 2,315,783 edges. The graph is automatically constructed from machine-readable dictionaries and Wiktionaries. Paths through the graph suggest word translations absent from any of the input dictionaries. We define a probabilistic inference procedure that enables us to quantify our confidence in a translation derived from the graph, and thus trade precision against recall. We demonstrate the graph's utility by employing it in the PANIMAGES cross-lingual image search engine. Google retrieves images based on the words in their "vicinity", which limits the ability of a searcher to retrieve them. Although images are universal, an English searcher will fail to find images tagged in Chinese, and conversely. PANIMAGES addresses this problem by translating and disambiguating queries, using the translation graph, before sending them to Google. Our experiments show that, for queries in "minor" languages, PANIMAGES increases the number of correct images in the first 15 pages of results by 75%.
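
The idea of deriving translations from paths in a graph can be illustrated with a toy sketch. It is not the paper's inference procedure, which the abstract does not specify. The sketch assumes that each edge carries a confidence that two words share a sense, that a path is scored by the product of its edge confidences, and that a candidate translation keeps its best path score. The edge list, the confidence values, and the names `build_graph` and `translate` are hypothetical.

```python
from collections import defaultdict

# Node = (language, word). Each edge holds an invented confidence that the
# two words share a sense; these numbers are illustrative, not from the paper.
EDGES = [
    (("en", "spring"), ("fr", "printemps"), 0.9),
    (("fr", "printemps"), ("de", "Frühling"), 0.8),
    (("en", "spring"), ("fr", "ressort"), 0.9),
    (("fr", "ressort"), ("de", "Feder"), 0.6),
    (("en", "spring"), ("de", "Frühling"), 0.5),
]


def build_graph(edges):
    """Build an undirected adjacency map: node -> {neighbor: confidence}."""
    graph = defaultdict(dict)
    for a, b, p in edges:
        graph[a][b] = p
        graph[b][a] = p
    return graph


def translate(graph, source, target_lang, threshold=0.0, max_hops=3):
    """Return {word: score} for target_lang words reachable from source.

    Assumed scoring rule: a path's score is the product of its edge
    confidences, and a word keeps the best score over all simple paths
    of at most max_hops edges. Raising threshold keeps fewer, more
    confident candidates.
    """
    best = {}

    def walk(node, score, visited, hops):
        if node != source and node[0] == target_lang:
            best[node[1]] = max(best.get(node[1], 0.0), score)
        if hops == max_hops:
            return
        for nxt, p in graph[node].items():
            if nxt not in visited:
                walk(nxt, score * p, visited | {nxt}, hops + 1)

    walk(source, 1.0, {source}, 0)
    return {w: s for w, s in best.items() if s >= threshold}


if __name__ == "__main__":
    g = build_graph(EDGES)
    # All German candidates, including one found only through French.
    print(translate(g, ("en", "spring"), "de"))
    # A stricter threshold keeps only the higher-scoring candidate.
    print(translate(g, ("en", "spring"), "de", threshold=0.6))
```

In this toy graph, "Feder" is reachable from "spring" only through the French word "ressort", so it is a translation that no single input edge states directly. Raising `threshold` drops lower-scoring candidates, which is the precision-versus-recall trade-off the abstract describes.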