A Graph is Worth K Words: Euclideanizing Graph using Pure Transformer
CoRR (2024)
Abstract
Can we model non-Euclidean graphs as pure language or even Euclidean vectors
while retaining their inherent information? The non-Euclidean property has
posed a long-term challenge in graph modeling. Despite recent GNN and
Graphformer efforts encoding graphs as Euclidean vectors, recovering the original
graph from the vectors remains a challenge. We introduce GraphsGPT, featuring a
Graph2Seq encoder that transforms non-Euclidean graphs into learnable graph
words in a Euclidean space, along with a GraphGPT decoder that reconstructs the
original graph from graph words to ensure information equivalence. We pretrain
GraphsGPT on 100M molecules and yield some interesting findings: (1) Pretrained
Graph2Seq excels in graph representation learning, achieving state-of-the-art
results on 8/9 graph classification and regression tasks. (2) Pretrained
GraphGPT serves as a strong graph generator, demonstrated by its ability to
perform both unconditional and conditional graph generation. (3)
Graph2Seq+GraphGPT enables effective graph mixup in the Euclidean space,
overcoming the previously known non-Euclidean challenge. (4) Our proposed novel
edge-centric GPT pretraining task is effective in graph fields, underscoring
its success in both representation and generation.
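
To illustrate finding (3), the following is a minimal sketch of mixup in a Euclidean space, assuming each graph has already been encoded into K fixed-size graph-word vectors. The function name `mixup_graph_words`, the toy embeddings, and the mixing weight `lam` are hypothetical and are not taken from the paper; the sketch shows only generic linear interpolation, not the authors' implementation.

```python
def mixup_graph_words(words_a, words_b, lam):
    """Linearly interpolate two equally shaped lists of graph-word vectors.

    Hypothetical helper: words_a and words_b stand in for the K Euclidean
    vectors that an encoder would produce for each of two graphs.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(words_a) != len(words_b):
        raise ValueError("both graphs need the same number of graph words")
    mixed = []
    for vec_a, vec_b in zip(words_a, words_b):
        if len(vec_a) != len(vec_b):
            raise ValueError("graph words must share one dimensionality")
        # Standard mixup: a convex combination of the two vectors.
        mixed.append([lam * x + (1.0 - lam) * y for x, y in zip(vec_a, vec_b)])
    return mixed


# Toy embeddings (assumed values): K = 2 graph words of dimension 3.
graph_a = [[1.0, 0.0, 2.0], [0.0, 4.0, 0.0]]
graph_b = [[3.0, 2.0, 0.0], [2.0, 0.0, 4.0]]
mixed = mixup_graph_words(graph_a, graph_b, lam=0.5)
```

Because the interpolation acts on vectors rather than on nodes and edges, no alignment between the two graphs is needed; in a full pipeline a decoder would then map the mixed vectors back to a graph.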