Label-free Node Classification on Graphs with Large Language Models (LLMS)
arXiv (Cornell University)(2023)
摘要
In recent years, there have been remarkable advancements in node
classification achieved by Graph Neural Networks (GNNs). However, they
necessitate abundant high-quality labels to ensure promising performance. In
contrast, Large Language Models (LLMs) exhibit impressive zero-shot proficiency
on text-attributed graphs. Yet, they face challenges in efficiently processing
structural data and suffer from high inference costs. In light of these
observations, this work introduces a label-free node classification on graphs
with LLMs pipeline, LLM-GNN. It amalgamates the strengths of both GNNs and LLMs
while mitigating their limitations. Specifically, LLMs are leveraged to
annotate a small portion of nodes and then GNNs are trained on LLMs'
annotations to make predictions for the remaining large portion of nodes. The
implementation of LLM-GNN faces a unique challenge: how can we actively select
nodes for LLMs to annotate and consequently enhance the GNN training? How can
we leverage LLMs to obtain annotations of high quality, representativeness, and
diversity, thereby enhancing GNN performance with less cost? To tackle this
challenge, we develop an annotation quality heuristic and leverage the
confidence scores derived from LLMs to advanced node selection. Comprehensive
experimental results validate the effectiveness of LLM-GNN. In particular,
LLM-GNN can achieve an accuracy of 74.9
a cost less than 1 dollar.
更多查看译文
关键词
large language models,graphs,node,llms,label-free
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要