Selection of Appropriate Symbolic Regression Models Using Statistical and Dynamic System Criteria: Example of Waste Gasification

AXIOMS (2022)

Abstract
In this paper, we analyze interpretable models discovered by symbolic regression from real gasification datasets of the project "Centre for Energy and Environmental Technologies" (CEET). Two statistical metrics are usually used to quantify the accuracy of such models against the input data: Mean Square Error (MSE) and the Pearson Correlation Coefficient (PCC). However, if the points used to construct and test the models are not chosen randomly from the continuum of the input variable, but from a limited number of discrete input points, the behavior of the model between those points may well fail to reflect the physics of the modelled phenomenon. For example, a model can oscillate unexpectedly between the points used, and the usual statistical metrics cannot detect such anomalies. By using dynamic system criteria in addition to statistical metrics, suspicious models that do not fit the expected behavior can be automatically detected and discarded. This communication presents a universal method based on dynamic system criteria that identifies suitable models among all those that score well on the statistical metrics. The dynamic system criteria measure the complexity of the candidate models using approximate entropy and sample entropy. Examples are given for waste gasification, where the output data (the percentage of each gas in the produced mixture) are available for only six values of the input (the temperature in the chamber where the process takes place). In such cases, instead of producing the expected simple spline-like curves, artificial intelligence tools can produce inappropriate oscillatory curves with sharp peaks, owing to the known tendency of symbolic regression to produce overfitted and relatively complex models even when the underlying physical model is simple.
Keywords
symbolic regression, Mean Square Error, Pearson Correlation Coefficient, oscillations in solutions, dynamic system criteria, waste gasification, Occam's Razor
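
The selection idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the six normalized "temperature" points, the two candidate functions `smooth_model` and `oscillatory_model`, and the parameter choices `m=2` and `r_factor=0.2` are hypothetical and are not taken from the CEET datasets or from the paper's implementation. The sketch uses a common textbook formulation of sample entropy and omits approximate entropy. Both candidates agree at the six data points, so MSE and PCC cannot separate them, whereas sample entropy evaluated on a dense grid between the points can.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean Square Error between observations and model predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def pcc(y_true, y_pred):
    """Pearson Correlation Coefficient between observations and predictions."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def sample_entropy(series, m=2, r_factor=0.2):
    """Sample entropy ln(B / A) of a 1-D series.

    B counts template pairs of length m that match within tolerance r,
    A counts pairs of length m + 1; self-matches are excluded.
    m and r_factor are illustrative defaults (assumption).
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    r = r_factor * np.std(x)

    def count_matches(length):
        # Use the same n - m starting indices for both template lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        # Chebyshev distance between every pair of templates.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Subtract the n - m self-matches on the diagonal.
        return int(np.sum(dist <= r)) - (n - m)

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")
    return float(np.log(b / a))


def smooth_model(t):
    """Hypothetical simple candidate: a straight line."""
    return 10.0 + 5.0 * t


def oscillatory_model(t):
    """Hypothetical overfitted candidate: same line plus an oscillation
    that vanishes (up to rounding) at t = 0, 0.2, ..., 1."""
    return 10.0 + 5.0 * t + 3.0 * np.sin(5.0 * np.pi * t)


# Six discrete input points (normalized temperature, hypothetical values).
t_data = np.linspace(0.0, 1.0, 6)
y_data = smooth_model(t_data)

# Dense grid used to probe the behavior between the six points.
t_dense = np.linspace(0.0, 1.0, 500)

for name, model in [("smooth", smooth_model), ("oscillatory", oscillatory_model)]:
    print(
        name,
        "MSE:", mse(y_data, model(t_data)),
        "PCC:", pcc(y_data, model(t_data)),
        "SampEn:", sample_entropy(model(t_dense)),
    )
```

In this sketch, a candidate would be kept only if it scores well on MSE and PCC and also has low entropy on the dense grid; the threshold separating "low" from "high" entropy is left open here because the abstract does not state one.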