StructChart: Perception, Structuring, Reasoning for Visual Chart Understanding
CoRR(2023)
Abstract
Charts are common in literature across different scientific fields, conveying
rich information easily accessible to readers. Current chart-related tasks
focus on either chart perception, which refers to extracting information from
visual charts, or reasoning over the extracted data, e.g., in
tabular form. In this paper, we aim to establish a unified and label-efficient
learning paradigm for joint perception and reasoning tasks, which can be
generally applicable to different downstream tasks, beyond the
question-answering task studied in peer works. Specifically,
StructChart first reformulates the chart information from the popular tabular
form (specifically, linearized CSV) to the proposed Structured Triplet
Representations (STR), which help reduce the task gap between
chart perception and reasoning owing to the structured information
extraction employed for charts. We then propose a Structuring Chart-oriented
Representation Metric (SCRM) to quantitatively evaluate performance on the
chart perception task. To enrich the dataset for training, we further explore
the possibility of leveraging a Large Language Model (LLM) to enhance
chart diversity in terms of both visual style and statistical
information. Extensive experiments are conducted on various chart-related
tasks, demonstrating the effectiveness and promising potential of a unified
chart perception-reasoning paradigm to push the frontier of chart
understanding.
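
To make the CSV-to-triplet reformulation concrete, here is a minimal sketch. The abstract does not spell out the triplet layout, so the (row label, column label, value) form used below is an assumption, and the function name `csv_to_triplets` and the sample table are hypothetical illustrations rather than the authors' implementation.

```python
import csv
import io


def csv_to_triplets(csv_text):
    """Convert a chart's CSV table into (row_label, column_label, value) triplets.

    Assumption: the first row holds column headers and the first column
    holds row labels. This layout is illustrative, not taken from the paper.
    """
    rows = list(csv.reader(io.StringIO(csv_text.strip())))
    header, body = rows[0], rows[1:]
    triplets = []
    for row in body:
        row_label = row[0].strip()
        # Pair each remaining cell with its column header.
        for col_label, cell in zip(header[1:], row[1:]):
            triplets.append((row_label, col_label.strip(), cell.strip()))
    return triplets


# Hypothetical chart data, for illustration only.
example_csv = """Year,Sales,Profit
2021,10,2
2022,15,4
"""
triplets = csv_to_triplets(example_csv)
for t in triplets:
    print(t)
```

Each triplet carries its own row and column context, so a set of triplets can be compared or consumed without depending on the order of rows and columns in the original table.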
Keywords
visual structchart, structuring, perception, understanding, reasoning