Buckwalter-based Lookup Tool as Language Resource for Arabic Language Learners.
SETQA-NLP '08: Software Engineering, Testing, and Quality Assurance for Natural Language Processing (2008)
Abstract
The morphology of the Arabic language is rich and complex; words are inflected to express variations in tense-aspect, person, number, and gender, while they may also appear with clitics attached to express possession on nouns, objects on verbs and prepositions, and conjunctions. Furthermore, Arabic script allows the omission of short vowel diacritics. For the Arabic language learner trying to understand non-diacritized text, the challenge when reading new vocabulary is first to isolate individual words within text tokens and then to determine the underlying lemma and root forms to look up the word in an Arabic dictionary.
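The two steps named above, isolating the word inside a token and then finding its lemma and root, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes Buckwalter-style transliteration and tries every prefix + stem + suffix split of a token against three small tables. The tables, their entries, and the function name `analyze` are all hypothetical, and the prefix-stem-suffix compatibility checks a real analyzer would need are omitted.

```python
# Toy lexicons in Buckwalter-style transliteration (illustrative entries only,
# not taken from the paper or from any real lexicon file).
PREFIXES = {"": "", "w": "and", "b": "with/by", "Al": "the"}
SUFFIXES = {"": "", "h": "his/him", "hA": "her/it"}
# Each undiacritized stem maps to one or more (lemma, root, gloss) entries,
# because omitting short vowels makes some stems ambiguous.
STEMS = {
    "ktAb": [("kitAb", "ktb", "book")],
    "ktb": [("katab", "ktb", "write"), ("kitAb", "ktb", "books (plural)")],
}


def analyze(token):
    """Return all (prefix, stem, suffix, lemma, root, gloss) readings of token."""
    results = []
    n = len(token)
    for i in range(n):                  # token[:i] is the candidate prefix
        for j in range(i + 1, n + 1):   # token[i:j] is a non-empty stem
            prefix, stem, suffix = token[:i], token[i:j], token[j:]
            if prefix in PREFIXES and suffix in SUFFIXES and stem in STEMS:
                for lemma, root, gloss in STEMS[stem]:
                    results.append((prefix, stem, suffix, lemma, root, gloss))
    return results


if __name__ == "__main__":
    # "wktAbh" splits as w + ktAb + h ("and" + "book" + "his").
    for reading in analyze("wktAbh"):
        print(reading)
    # "ktb" has two readings in the toy tables, since the vowels are not written.
    for reading in analyze("ktb"):
        print(reading)
```

Returning every matching split, rather than the first one, reflects the ambiguity a learner faces with non-diacritized text: the tool can only list candidate readings, and the reader chooses among them from context.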
Keywords
Arabic dictionary, Arabic language, Arabic language learner, Arabic script, non-diacritized text, text token, individual word, new vocabulary, root form, short vowel diacritic, language resource, Buckwalter-based lookup tool