A Corpus Based Technique For Repairing Ill-Formed Sentences With Word Order Errors Using Co-Occurrences Of N-Grams

INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS (2011)

Abstract
There are several reasons to expect that recognising word order errors in a text is a difficult problem, and the recognition rates reported in the literature are in fact low. Although grammatical rules constructed by computational linguists improve the performance of a grammar checker in diagnosing word order errors, repairing them remains very difficult. A good indicator of whether a person really knows a language is the ability to use the appropriate words in a sentence in the correct order: most languages have a fairly fixed word order, and scrambling the words of a sentence produces a meaningless one. This paper describes a language-independent method for repairing sentences with wrong word order using a statistical language model (LM). The method relies on the probabilities of the most typical trigrams and bigrams extracted from a large text corpus such as the British National Corpus (BNC).
Keywords
Word order errors, statistical language model, permutations filtering, British National Corpus
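
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea: enumerate the permutations of a scrambled sentence and keep the one whose bigrams and trigrams are most typical of a corpus. The three-sentence toy corpus (standing in for the BNC), the `log(1 + count)` score, and the function names `build_counts`, `score`, and `repair` are all assumptions made for illustration, not the paper's actual method.

```python
from collections import Counter
from itertools import permutations
import math

# Toy stand-in for a large corpus such as the BNC (assumption for illustration).
CORPUS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat sat on a chair",
]

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_counts(sentences):
    """Count bigrams and trigrams, with sentence boundary markers."""
    bigrams, trigrams = Counter(), Counter()
    for sentence in sentences:
        toks = ["<s>"] + sentence.split() + ["</s>"]
        bigrams.update(ngrams(toks, 2))
        trigrams.update(ngrams(toks, 3))
    return bigrams, trigrams

def score(tokens, bigrams, trigrams):
    """Hypothetical score: sum of log(1 + count) over bigrams and trigrams.

    Unseen n-grams contribute 0, so orderings made of frequent
    corpus n-grams score higher.
    """
    toks = ["<s>"] + list(tokens) + ["</s>"]
    total = sum(math.log(1 + bigrams[g]) for g in ngrams(toks, 2))
    total += sum(math.log(1 + trigrams[g]) for g in ngrams(toks, 3))
    return total

def repair(sentence, bigrams, trigrams):
    """Return the highest-scoring permutation of the sentence's words."""
    # set() removes duplicate orderings caused by repeated words;
    # sorted() makes tie-breaking deterministic.
    candidates = sorted(set(permutations(sentence.split())))
    best = max(candidates, key=lambda c: score(c, bigrams, trigrams))
    return " ".join(best)

if __name__ == "__main__":
    bigrams, trigrams = build_counts(CORPUS)
    print(repair("sat cat the on mat the", bigrams, trigrams))
```

With this toy corpus, the scrambled input `sat cat the on mat the` is restored to `the cat sat on the mat`. Exhaustive enumeration grows factorially with sentence length, so a sketch like this is only practical for short sentences; longer inputs would need some way of discarding unlikely permutations early.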