Combining classifiers for Chinese word segmentation

SIGHAN '02: Proceedings of the First SIGHAN Workshop on Chinese Language Processing - Volume 18 (2002)

Abstract
In this paper we report results of a supervised machine-learning approach to Chinese word segmentation. First, a maximum entropy tagger is trained on manually annotated data to automatically label each character with a tag that indicates the position of the character within a word. An error-driven transformation-based tagger is then trained to clean up the tagging inconsistencies of the first tagger. The tagged output is then converted into segmented text. The preliminary results show that this approach is competitive with other supervised machine-learning segmenters reported in previous studies.
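The last step of the pipeline, converting per-character position tags back into segmented text, can be illustrated with a small sketch. The abstract does not name the tag set, so the four tags used here (B for word-initial, M for word-internal, E for word-final, S for a single-character word) and the function names `words_to_tags` and `tags_to_words` are assumptions made for illustration only, not the authors' implementation. The taggers themselves are not modeled.

```python
# Hypothetical tag set (an assumption, not taken from the paper):
#   B = first character of a word, M = middle, E = last, S = single-character word

def words_to_tags(words):
    """Turn a list of segmented words into (character, tag) pairs."""
    tagged = []
    for word in words:
        if len(word) == 1:
            tagged.append((word, "S"))
        else:
            tagged.append((word[0], "B"))
            tagged.extend((ch, "M") for ch in word[1:-1])
            tagged.append((word[-1], "E"))
    return tagged


def tags_to_words(tagged):
    """Convert (character, tag) pairs back into a list of words.

    Inconsistent sequences (for example B followed directly by B) are
    handled by closing the pending word whenever a new word start is seen.
    """
    words = []
    current = ""
    for ch, tag in tagged:
        # A word-start tag closes any word still left open.
        if tag in ("B", "S") and current:
            words.append(current)
            current = ""
        current += ch
        # A word-end tag closes the current word.
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


if __name__ == "__main__":
    sentence = ["上海", "计划", "发展", "的", "经济"]
    tagged = words_to_tags(sentence)
    print(" ".join(tags_to_words(tagged)))
```

The fallback for inconsistent sequences in `tags_to_words` is one possible design choice. In the approach described above, such inconsistencies are instead meant to be repaired by the transformation-based tagger before conversion.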
Keywords
combining classifier, supervised machine-learning segmenters, previous study, supervised machine-learning approach, preliminary result, annotated data, tagging inconsistency, maximum entropy tagger, segmented text, error-driven transformation-based tagger, Chinese word segmentation