A Language Model based Framework for New Concept Placement in Ontologies
CoRR (2024)
Abstract
We investigate the task of inserting new concepts extracted from text into
an ontology using language models. We explore a three-step approach: edge
search, which finds a set of candidate locations for insertion (i.e.,
subsumptions between concepts); edge formation and enrichment, which leverages
the ontological structure to produce and enhance the edge candidates; and edge
selection, which finally identifies the edge at which the new concept is placed.
In all steps, we propose to leverage neural methods: we apply embedding-based
methods and contrastive learning with Pre-trained Language Models (PLMs) such as
BERT for edge search, and adapt a BERT fine-tuning-based multi-label
Edge-Cross-encoder, as well as Large Language Models (LLMs) such as the GPT
series, FLAN-T5, and Llama 2, for edge selection. We evaluate the methods on
recent datasets created using the
SNOMED CT ontology and the MedMentions entity linking benchmark. The best
settings in our framework use a fine-tuned PLM for search and a multi-label
Cross-encoder for selection. Zero-shot prompting of LLMs is still not adequate
for the task, and we propose explainable instruction tuning of LLMs for
improved performance. Our study shows the advantages of PLMs and highlights the
encouraging performance of LLMs, which motivates future studies.
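
The three steps named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the toy ontology, the function names, and the bag-of-words `embed` function, which stands in for a PLM embedding such as BERT. The selection step uses a plain similarity sum rather than a Cross-encoder or an LLM.

```python
import math
from collections import Counter

# Hypothetical toy ontology: (parent, child) subsumption edges.
EDGES = [
    ("disease", "heart disease"),
    ("disease", "lung disease"),
    ("heart disease", "ischemic heart disease"),
    ("lung disease", "viral pneumonia"),
]

def embed(text):
    """Toy bag-of-words vector; a stand-in for a PLM embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) \
        * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def edge_search(mention, edges, k=2):
    """Step 1: retrieve the k concepts most similar to the mention."""
    concepts = sorted({c for edge in edges for c in edge})
    m = embed(mention)
    return sorted(concepts, key=lambda c: cosine(m, embed(c)), reverse=True)[:k]

def edge_formation(seeds, edges):
    """Step 2: use the ontology structure to form candidate edges
    from every edge that touches a retrieved concept."""
    return [e for e in edges if e[0] in seeds or e[1] in seeds]

def edge_selection(mention, candidates):
    """Step 3: pick the candidate edge whose parent and child are
    jointly most similar to the mention (simplified scorer)."""
    m = embed(mention)
    return max(candidates,
               key=lambda e: cosine(m, embed(e[0])) + cosine(m, embed(e[1])))

mention = "acute heart disease"
seeds = edge_search(mention, EDGES)
candidates = edge_formation(seeds, EDGES)
selected = edge_selection(mention, candidates)
print(seeds, candidates, selected)
```

In a realistic setting, `embed` would be replaced by a fine-tuned encoder and `edge_selection` by a learned scorer; the sketch only shows how the output of each step feeds the next.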