Cross-Lingual Transfer Learning for Medical Named Entity Recognition

Database Systems for Advanced Applications (2020)

Abstract
Many techniques have been explored in search of the best approach to cross-lingual transfer learning. In the medical domain, Named Entity Recognition (NER) is pivotal for many downstream tasks, such as medical entity linking and clinical decision support systems. However, the lack of annotations limits its applicability in the many languages that do not have enough labeled data. To alleviate this issue and make use of languages with sufficient annotated data, we find a new way to obtain a medical parallel corpus from medical terminology systems and knowledge bases, and we propose a methodology that combines cross-lingual language model pretraining with bilingual word embedding alignment, aided by the parallel corpus. Moreover, because our combined architecture maintains the framework of the pretrained model, it can be used not only for NER but also for other downstream NLP tasks. Experiments demonstrate that incorporating Chinese and English medical data can effectively improve performance on an English medical NER dataset (i2b2).
Keywords
Transfer learning, Cross-lingual pretraining, Word embedding alignment, Medical terminology systems, Medical NER