A CLIR-oriented OOV translation mining method from bilingual webpages

ICMLC(2011)

Citations: 2 | Views: 17
Abstract
Translating unknown (out-of-vocabulary, OOV) terms is a major bottleneck for cross-language information retrieval (CLIR). This paper proposes a solution covering three problems: detecting relevant webpages, extracting translations with correct boundaries, and ranking candidate translations. Translations of topic words are used to expand the source query and collect bilingual search engine snippets. An improved Frequency Change Measurement method then extracts valid candidates from these small, noisy bilingual corpora. Finally, frequency-distance, surface-pattern, and phonetic features are used to select the correct translation. Experimental results show impressive performance on unknown term translation mining.
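The abstract names frequency-distance as one ranking feature but does not define it. The following is a minimal sketch of one common formulation, not the paper's actual method: each candidate is scored by summing the inverse character distance to the nearest occurrence of the source term across snippets. The function name `rank_candidates`, the inverse-distance weighting, and the sample snippets are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def rank_candidates(snippets, source_term, candidates):
    """Rank candidates by a simple frequency-distance score (illustrative only).

    A candidate gains 1/d for every occurrence in a snippet, where d is the
    character distance to the nearest occurrence of source_term in that
    snippet. Frequent candidates that appear close to the source term
    therefore score highest. Candidates that never co-occur are omitted.
    """
    scores = defaultdict(float)
    for snippet in snippets:
        # Start offsets of the source term in this snippet.
        src_positions = [
            m.start() for m in re.finditer(re.escape(source_term), snippet)
        ]
        if not src_positions:
            continue  # no co-occurrence possible in this snippet
        for cand in candidates:
            for m in re.finditer(re.escape(cand), snippet):
                d = min(abs(m.start() - p) for p in src_positions)
                scores[cand] += 1.0 / max(d, 1)  # guard against d == 0
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical snippets; a real system would collect them from a search engine.
snippets = [
    "信息检索 (information retrieval) 系统",
    "information retrieval 即 信息检索",
]
ranked = rank_candidates(snippets, "information retrieval", ["信息检索", "系统"])
```

In the approach the abstract describes, a score like this would be only one signal, used together with surface-pattern and phonetic features when choosing the final translation.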
Keywords
translation extraction, web mining, search engine, word translations, Frequency Change Measurement method, CLIR-oriented OOV translation mining method, cross-language IR, Internet, bilingual search engine snippets, data mining, natural language processing, translation ranking, bilingual webpages, webpage detection, pattern matching, cybernetics, machine learning, noise, search engines