LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
arXiv (2024)
Abstract
Mathematical equations have been unreasonably effective in describing complex
natural phenomena across various scientific disciplines. However, discovering
such insightful equations from data presents significant challenges due to the
necessity of navigating extremely high-dimensional combinatorial and nonlinear
hypothesis spaces. Traditional methods of equation discovery largely focus on
extracting equations from data alone, often neglecting the rich domain-specific
prior knowledge that scientists typically depend on. To bridge this gap, we
introduce LLM-SR, a novel approach that leverages the extensive scientific
knowledge and robust code generation capabilities of Large Language Models
(LLMs) to discover scientific equations from data in an efficient manner.
Specifically, LLM-SR treats equations as programs with mathematical operators
and combines LLMs' scientific priors with evolutionary search over equation
programs. The LLM iteratively proposes new equation skeletons, drawing from its
physical understanding, which are then optimized against data to estimate
skeleton parameters. We demonstrate LLM-SR's effectiveness across three diverse
scientific domains, where it discovers physically accurate equations that
provide significantly better fits to in-domain and out-of-domain data compared
to the well-established equation discovery baselines.
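
The loop described above (propose an equation skeleton as a program, fit its parameters to data, keep the best-scoring candidate) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two hand-written skeletons stand in for LLM proposals, the synthetic data is invented, and an ordinary least-squares solve stands in for whatever parameter optimizer the authors use, which the abstract does not specify. The fit also assumes each skeleton is linear in its parameters, which the method itself does not require.

```python
import math

# Hypothetical equation skeletons written as programs: fixed structure,
# free parameters p. In LLM-SR these would be proposed by the LLM; here
# they are hand-written stand-ins.
def skeleton_linear(x, p):
    return p[0] + p[1] * x

def skeleton_with_sine(x, p):
    return p[0] * x + p[1] * math.sin(x)

def solve(a, b):
    """Solve a @ z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(skeleton, n_params, xs, ys):
    """Least-squares fit, assuming the skeleton is linear in its parameters.

    Under that assumption, evaluating the skeleton with a unit parameter
    vector e_i yields the i-th basis term, i.e. one design-matrix column.
    """
    units = [[1.0 if j == i else 0.0 for j in range(n_params)]
             for i in range(n_params)]
    cols = [[skeleton(x, e) for x in xs] for e in units]
    ata = [[sum(u * v for u, v in zip(cols[i], cols[j]))
            for j in range(n_params)] for i in range(n_params)]
    aty = [sum(u * y for u, y in zip(cols[i], ys)) for i in range(n_params)]
    params = solve(ata, aty)
    mse = sum((skeleton(x, params) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return params, mse

# Synthetic, noise-free data (invented for this sketch): y = 2x + 0.5 sin(x).
xs = [0.1 * i for i in range(50)]
ys = [2.0 * x + 0.5 * math.sin(x) for x in xs]

# One "generation": score every candidate skeleton and keep the best fit.
candidates = {
    "linear": (skeleton_linear, 2),
    "with_sine": (skeleton_with_sine, 2),
}
results = {name: fit(f, k, xs, ys) for name, (f, k) in candidates.items()}
best_name = min(results, key=lambda name: results[name][1])
best_params, best_mse = results[best_name]
print(best_name, best_params, best_mse)
```

In a fuller version, the scores of earlier candidates would be fed back into the prompt so that the model can propose refined skeletons, and a general nonlinear optimizer would replace the linear solve.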