Neural Collaborative Graph Machines for Table Structure Recognition

IEEE Conference on Computer Vision and Pattern Recognition(2022)

引用 18|浏览56
暂无评分
摘要
Recently, table structure recognition has achieved impressive progress with the help of deep graph models. Most of them exploit single visual cues of tabular elements or simply combine visual cues with other modalities via early fusion to reason their graph relationships. However, neither early fusion nor individually reasoning in terms of multiple modalities can be appropriate for all varieties of table structures with great diversity. Instead, different modalities are expected to collaborate with each other in different patterns for different table cases. In the community, the importance of intrainter modality interactions for table structure reasoning is still unexplored. In this paper, we define it as heterogeneous table structure recognition (HeteroTSR) problem. With the aim offilling this gap, we present a novel Neural Collaborative Graph Machines (NCGM) equipped with stacked collaborative blocks, which alternatively extracts intramodality context and models inter-modality interactions in a hierarchical way. It can represent the intrainter modality relationships of tabular elements more robustly, which significantly improves the recognition performance. We also show that the proposed NCGM can modulate collaborative pattern of different modalities conditioned on the context of intramodality cues, which is vital for diversified table cases. Experimental results on benchmarks demonstrate our proposed NCGM achieves state-of-the-art performance and beats other contemporary methods by a large margin especially under challenging scenarios.
更多
查看译文
关键词
Document analysis and understanding, Visual reasoning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要