Identifying Broken Plurals, Irregular Gender, and Rationality in Arabic Text.

EACL '12: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (2012)

Abstract
Arabic morphology is complex, partly because of its richness, and partly because of common irregular word forms, such as broken plurals (which resemble singular nouns), and nouns with irregular gender (feminine nouns that look masculine and vice versa). In addition, Arabic morpho-syntactic agreement interacts with the lexical semantic feature of rationality, which has no morphological realization. In this paper, we present a series of experiments on the automatic prediction of the latent linguistic features of functional gender and number, and rationality in Arabic. We compare two techniques, using simple maximum likelihood (MLE) with back-off and a support vector machine based sequence tagger (Yamcha). We study a number of orthographic, morphological and syntactic learning features. Our results show that the MLE technique is preferred for words seen in the training data, while the Yamcha technique is optimal for unseen words, which are our real target. Furthermore, we show that for unseen words, morphological features help beyond orthographic features and that syntactic features help even more. A combination of the two techniques improves overall performance even further.
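As a rough illustration of the MLE-with-back-off idea and of combining it with a second model for unseen words, here is a minimal Python sketch. It is not the authors' implementation: the back-off order (word plus POS tag, then word alone), the class name `BackoffMLE`, the helper `combined_predict`, and the `fallback` callable standing in for a sequence tagger are all assumptions made for the example.

```python
from collections import Counter, defaultdict


class BackoffMLE:
    """Most-frequent-label predictor with a two-level back-off (hypothetical)."""

    def __init__(self):
        self.by_word_pos = defaultdict(Counter)  # (word, pos) -> label counts
        self.by_word = defaultdict(Counter)      # word -> label counts

    def train(self, examples):
        # examples: iterable of (word, pos, label) triples
        for word, pos, label in examples:
            self.by_word_pos[(word, pos)][label] += 1
            self.by_word[word][label] += 1

    def predict(self, word, pos):
        # Try the most specific key first, then back off to the word alone.
        for table, key in ((self.by_word_pos, (word, pos)), (self.by_word, word)):
            if key in table:
                return table[key].most_common(1)[0][0]
        return None  # unseen word: leave the decision to another model


def combined_predict(mle, fallback, word, pos):
    """Use the MLE label for seen words, otherwise ask the fallback tagger."""
    label = mle.predict(word, pos)
    return label if label is not None else fallback(word, pos)
```

Here `fallback` could be any function that maps a word and its POS tag to a label, for example a wrapper around a trained sequence tagger; the sketch only shows where the hand-off for unseen words would happen.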
Keywords
unseen word, Arabic morpho-syntactic agreement interacts, Arabic morphology, morphological feature, morphological realization, MLE technique, Yamcha technique, common irregular word form, functional gender, irregular gender, Arabic text, broken plural