Vafa Spell-Checker For Detecting Spelling, Grammatical, And Real-Word Errors Of Persian Language

DIGITAL SCHOLARSHIP IN THE HUMANITIES (2016)

Abstract
With advances in industry and information technology, large volumes of electronic documents such as newspapers, emails, weblogs, and theses are produced daily. Electronic documents offer considerable benefits, such as easy organization and data management, so automatic systems such as spelling and grammar checkers/correctors can help improve their quality. This article describes the development of Vafa Spell-Checker, an automatic spelling, grammatical, and real-word error checker for the Persian (Farsi) language. Errors in a text can be categorized into spelling, grammatical, and real-word errors. Vafa Spell-Checker is a hybrid system that uses both rule-based and statistical approaches to detect and correct all of these error types. The detection and correction phases for spelling and real-word errors are fully statistical, while the grammar checker uses a rule-based approach. Vafa Spell-Checker attempts to handle these error types in a single integrated system for Persian. Results on a test set collected from real-world text indicate that further work on the grammar checker requires statistical approaches. Evaluation results in terms of the F-0.5 measure for the spell-checker, grammar checker, and real-word error checker are about 0.908, 0.452, and 0.187, respectively. Moreover, several freely usable language resources for Persian that were generated during this project are presented in this article. These resources could be used in further research on the Persian language.
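
The abstract reports scores as an F-0.5 measure without defining it. Assuming this is the standard F-beta score with beta = 0.5, which weights precision more heavily than recall, a minimal sketch of the computation might look like the following. The function name `f_beta` and the sample precision and recall values are illustrative assumptions; they are not taken from the paper, which does not give precision or recall figures in this abstract.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score: a weighted harmonic mean of precision and recall.

    With beta < 1 (here 0.5), precision counts for more than recall.
    Assumption: this is the measure the abstract calls "F-0.5".
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0.0:
        # No correct detections at all: define the score as 0.
        return 0.0
    return (1.0 + b2) * precision * recall / denom


# Hypothetical values for illustration only (not results from the paper).
example_precision = 0.8
example_recall = 0.5
score_f05 = f_beta(example_precision, example_recall)        # beta = 0.5
score_f1 = f_beta(example_precision, example_recall, 1.0)    # beta = 1.0
print(round(score_f05, 3), round(score_f1, 3))
```

For a checker whose precision exceeds its recall, as in the hypothetical values above, the F-0.5 score comes out higher than the F1 score, because false alarms are penalized more heavily than missed errors.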