Combining Language and Graph Models for Semi-structured Information Extraction on the Web
CoRR (2024)
Abstract
Relation extraction is an efficient way of mining the extraordinary wealth of
human knowledge on the Web. Existing methods rely on domain-specific training
data or produce noisy outputs. We focus here on extracting targeted relations
from semi-structured web pages given only a short description of the relation.
We present GraphScholarBERT, an open-domain information extraction method based
on a joint graph and language model structure. GraphScholarBERT can generalize
to previously unseen domains without additional data or training and produces
only clean extraction results matched to the search keyword. Experiments show
that GraphScholarBERT can improve extraction F1 scores by as much as 34.8%
compared to previous work in a zero-shot domain and zero-shot website setting.