Integrating Large Language Models in Causal Discovery: A Statistical Causal Approach
CoRR (2024)
Abstract
In practical statistical causal discovery (SCD), embedding domain expert
knowledge as constraints into the algorithm is widely accepted as important
for creating consistent, meaningful causal models, despite the recognized
challenges in systematically acquiring such background knowledge. To overcome
these challenges, this paper proposes a novel methodology for causal inference
in which SCD methods and knowledge-based causal inference (KBCI) with a large
language model (LLM) are synthesized through "statistical causal prompting
(SCP)" for LLMs and prior knowledge augmentation for SCD. Experiments show
that, with GPT-4, both the LLM-KBCI output and the SCD result obtained with
prior knowledge from LLM-KBCI approach the ground truth, and that the SCD
result improves further if GPT-4 undergoes SCP. Furthermore, the experiments
show that an LLM can improve SCD with its background knowledge even if the LLM
does not contain information on the dataset. The proposed approach can thus
address challenges such as dataset biases and limitations, illustrating the
potential of LLMs to improve data-driven causal inference across diverse
scientific domains.
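
The two ingredients named in the abstract, a prompt that carries a first-pass SCD result (SCP) and prior knowledge fed back into SCD, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the prompt wording, the probability thresholds, and the matrix convention (row = cause, column = effect; 1 = edge required, 0 = edge forbidden, -1 = undecided). No LLM or SCD library is called; the LLM's answers are represented by a hypothetical dictionary of "yes" probabilities.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def build_scp_prompt(cause: str, effect: str, scd_edges: Dict[Edge, float]) -> str:
    """Build a prompt that shows the LLM a first-pass SCD result.

    The wording is hypothetical, not the paper's actual prompt template.
    """
    lines = [
        f"- {c} -> {e} (coefficient {w:.2f})"
        for (c, e), w in sorted(scd_edges.items())
    ]
    summary = "\n".join(lines) if lines else "- no edges found"
    return (
        "A statistical causal discovery method suggested these edges:\n"
        f"{summary}\n"
        f"Given this result and your domain knowledge, does {cause} "
        f"directly cause {effect}? Answer yes or no."
    )


def prior_knowledge_matrix(
    variables: List[str],
    p_yes: Dict[Edge, float],
    forbid_below: float = 0.05,   # assumed threshold, not from the paper
    require_above: float = 0.95,  # assumed threshold, not from the paper
) -> List[List[int]]:
    """Turn LLM 'yes' probabilities into a prior-knowledge matrix.

    Assumed convention: pk[i][j] describes the edge variables[i] -> variables[j];
    1 = required, 0 = forbidden, -1 = left for the SCD algorithm to decide.
    """
    n = len(variables)
    pk = [[-1] * n for _ in range(n)]
    for i, cause in enumerate(variables):
        for j, effect in enumerate(variables):
            if i == j:
                pk[i][j] = 0  # no self-loops
                continue
            p = p_yes.get((cause, effect))
            if p is None:
                continue  # LLM was not asked about this pair
            if p >= require_above:
                pk[i][j] = 1
            elif p <= forbid_below:
                pk[i][j] = 0
    return pk


# Toy usage with made-up numbers.
variables = ["smoking", "cancer"]
prompt = build_scp_prompt("smoking", "cancer", {("smoking", "cancer"): 0.8})
p_yes = {("smoking", "cancer"): 0.99, ("cancer", "smoking"): 0.01}
pk = prior_knowledge_matrix(variables, p_yes)
```

In a full pipeline, a matrix like `pk` would be passed to an SCD algorithm that accepts edge constraints for a second run; how the constraints are encoded depends on the SCD implementation and may differ from the convention assumed here.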