Robust Multi-task Learning-based Korean POS Tagging to OvercomeWord Spacing Errors

ACM Transactions on Asian and Low-Resource Language Information Processing(2023)

引用 0|浏览4
暂无评分
摘要
End-to-end neural network-based approaches have recently demonstrated significant improvements in natural language processing (NLP). However, in the NLP application such as assistant systems, NLP components are still processed to extract results using a pipeline paradigm. The pipeline-based concept has issues with error propagation. In Korean, morphological analysis and part-of-speech (POS) tagging step, incorrectly analyzing POS tags for a sentence containing spacing errors negatively affects other modules behind the POS module. Hence, we present a multi-task learning-based POS tagging neural model for Korean with word spacing challenges. When we apply this model to the Korean morphological analysis and POS tagging, we get findings that are robust to word spacing errors. We adopt syllable-level input and output formats, as well as a simple structure for ELECTRA and RNN-CRF models for multi-task learning, and we achieve a good performance 98.30 of F1, better than previous studies on the Sejong corpus test set.
更多
查看译文
关键词
Morphological analysis, part-of-speech tagging, word spacing, multi-task learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要