A Novel Cascade Instruction Tuning Method for Biomedical NER

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024

Abstract
Large language models (LLMs) have achieved remarkable performance on various tasks. However, LLMs remain severely limited in domain generalization, primarily because of inherent constraints: closed-source LLMs face restrictions on fine-tuning, while open-source LLMs contend with the scarcity of domain-specific data. Moreover, LLMs often prioritize standard patterns and inadvertently neglect intricate, domain-specific ones, which hampers effective domain generalization. In this paper, inspired by curriculum learning, we explore a cascade instruction tuning method for training domain-specific LLMs that can excel in broad applications such as information extraction. Taking biomedical named entity recognition (BioNER) as a case study, we show how to cultivate general LLMs into domain-specific LLMs with limited domain data and how to address the complex patterns in BioNER. To validate our method, we construct NER INSTRUCTIONS, the largest and broadest benchmark, sourced from 55 publicly available NER datasets across 17 domains. We conduct extensive experiments on the dataset, and the results demonstrate the effectiveness of our proposed framework in downstream task generalization and its ability to tackle intricate patterns.
Keywords
large language model, instruction learning, biomedical named entity recognition, domain adaptation