Evaluation of CatBoost Method for Predicting Weekly Pan Evaporation in Subtropical and Sub-Humid Regions

Pure and Applied Geophysics (2024)

Abstract
Pan evaporation modeling and forecasting are needed to provide timely, continuous, and valuable information to support water management. This study aimed to overcome the constraints of traditional regression techniques by applying a less explored machine learning model, CatBoost, to enhance the precision and comprehensiveness of pan evaporation estimation. The CatBoost model was also compared with other machine-learning approaches, viz., multiple linear regression (MLR), multiple non-linear least-squares regression (MNLSR), multivariate adaptive regression splines (MARS), random forest regression (RF), and M5 model tree (M5Tree). The models were developed using data obtained from the Crop Research Center, Pantnagar, Uttarakhand, India. Stepwise regression was used as the input variable selection algorithm to choose the most relevant inputs from several meteorological variables. For model development, the data were split into two subsets: the first three years were reserved for model calibration, while the fourth-year data were used for model validation. Six statistical measures, namely root mean square error (RMSE), Nash–Sutcliffe efficiency (NSE), Willmott index of agreement (WI), Pearson correlation coefficient (PCC), coefficient of determination (R²), and mean absolute error (MAE), were used to assess the effectiveness of the weekly pan evaporation estimation models. Findings indicated that the CatBoost model, based on four predictors, was the most accurate at Pantnagar station, with NSE, WI, MAE, RMSE, PCC, and R² ranging over 0.9838–0.9877, 0.9960–0.9968, 0.1991–0.2078, 0.3643–0.3650, 0.9925–0.9955, and 0.9851–0.9911, respectively. The RF model was the second-best performer in accuracy and effectiveness for mapping weekly pan evaporation, followed by MARS, MNLSR, MLR, and M5Tree.
In conclusion, CatBoost's greedy approach to considering feature combinations amounts to a careful, cautious selection of features during the modeling process. This can enhance the model's robustness and generalization capability, contributing to more accurate predictions in diverse environmental conditions in subtropical and sub-humid regions.
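The six evaluation measures named in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' code: the function name `evaluation_metrics` and the sample values are hypothetical, the formulas are the standard textbook definitions, and R² is assumed here to be the squared Pearson correlation (which agrees with the reported ranges, since 0.9925² ≈ 0.9851).

```python
import numpy as np


def evaluation_metrics(observed, predicted):
    """Return RMSE, MAE, NSE, WI, PCC and R2 for paired observed/predicted values.

    Hypothetical helper for illustration; assumes the standard definitions
    and takes R2 as the squared Pearson correlation coefficient.
    """
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - obs
    obs_mean = obs.mean()

    # Error magnitudes (same units as pan evaporation)
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))

    # Nash-Sutcliffe efficiency: 1 minus error variance over observed variance
    nse = 1.0 - np.sum(err ** 2) / np.sum((obs - obs_mean) ** 2)

    # Willmott index of agreement (original 1981 form)
    potential_error = np.sum(
        (np.abs(pred - obs_mean) + np.abs(obs - obs_mean)) ** 2
    )
    wi = 1.0 - np.sum(err ** 2) / potential_error

    # Pearson correlation and its square
    pcc = np.corrcoef(obs, pred)[0, 1]

    return {
        "RMSE": float(rmse),
        "MAE": float(mae),
        "NSE": float(nse),
        "WI": float(wi),
        "PCC": float(pcc),
        "R2": float(pcc ** 2),
    }


if __name__ == "__main__":
    # Made-up values purely to exercise the function; not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [2.0, 3.0, 4.0, 5.0]
    for name, value in evaluation_metrics(observed, predicted).items():
        print(f"{name}: {value:.4f}")
```

Note that a constant bias, as in the made-up example, leaves PCC and R² at 1 while lowering NSE and WI, which is one reason to report the error-based and correlation-based measures together.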
Keywords
Pan evaporation,MLR,MNLSR,MARS,Random forest,CatBoost