Rethinking Tabular Data Understanding with Large Language Models
CoRR(2023)
Abstract
Large Language Models (LLMs) have been shown to be capable of various tasks, yet
their capability in interpreting and reasoning over tabular data remains an
underexplored area. This study examines that capability from three core
perspectives: the robustness of LLMs to structural perturbations in tables, the
comparative analysis of textual and symbolic reasoning on tables, and the
potential of boosting model performance through the aggregation of multiple
reasoning pathways. We find that structural variation among tables presenting
the same content causes a notable performance decline, particularly in
symbolic reasoning tasks. This motivates a method for table
structure normalization. Moreover, textual reasoning slightly edges out
symbolic reasoning, and a detailed error analysis reveals that each exhibits
different strengths depending on the specific tasks. Notably, aggregating
textual and symbolic reasoning pathways, bolstered by a mix self-consistency
mechanism, achieves state-of-the-art (SOTA) performance, with an accuracy of
73.6% on WIKITABLEQUESTIONS, a substantial advance over previous LLM-based
table processing paradigms.
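
The aggregation idea can be illustrated with a minimal sketch. The abstract does not spell out how the mix self-consistency mechanism works, so the code below assumes the simplest reading: answers sampled from textual reasoning and from symbolic reasoning are pooled, and the most frequent answer wins. The names `normalize_answer` and `mix_self_consistency` are hypothetical and do not come from the paper.

```python
from collections import Counter


def normalize_answer(answer):
    """Lowercase and trim an answer so trivially different strings match.

    Hypothetical helper; a failed reasoning path (None) becomes "".
    """
    if answer is None:
        return ""
    return str(answer).strip().lower()


def mix_self_consistency(textual_answers, symbolic_answers):
    """Return the most frequent answer across both reasoning styles.

    Assumption: a plain majority vote over the pooled samples. The paper's
    actual mechanism may weight or select the samples differently.
    """
    pooled = [normalize_answer(a) for a in textual_answers]
    pooled += [normalize_answer(a) for a in symbolic_answers]
    pooled = [a for a in pooled if a]  # drop empty or failed outputs
    if not pooled:
        return None
    return Counter(pooled).most_common(1)[0][0]


# Illustrative samples only, not results from the paper.
textual = ["Paris", "paris ", "Lyon"]
symbolic = ["Paris", "Lyon", None]
final_answer = mix_self_consistency(textual, symbolic)
```

Pooling both kinds of samples lets one reasoning style outvote the other on questions where the other tends to fail, which matches the abstract's observation that each style has different strengths.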