Evaluating Machine Learning Algorithms for Accuracy, Stability, and Among-Predictors Discriminability in Modeling Species-Richness Across Ten Datasets
Ecological Informatics (2025)
Abstract
Global biodiversity is experiencing substantial declines, and mitigating this crisis requires analytical approaches that can accurately predict biodiversity in relation to natural conditions and human-induced stressors. While numerous machine learning (ML) algorithms for regression are available for such analyses, synthesizing outcomes across studies is challenging due to: (1) reliance on single datasets, limiting generalizability; (2) varying modeling processes; (3) inconsistent performance criteria; and (4) limited consideration of model stability and among-predictors discriminability. We addressed these issues by applying five ML algorithms, namely Random Forest (RF), Boosted Regression Tree (BRT), Extreme Gradient Boosting (XGB), Conditional Inference Forest (CIF), and Lasso, to ten large datasets on freshwater fish, mussels, and caddisflies. Using consistent modeling methods, we evaluated accuracy (R² and RMSE), stability (coefficient of variation of R² and RMSE), and among-predictors discriminability (variation in predictor importance). RF, BRT, and XGB generally achieved higher accuracy than CIF and Lasso, although performance varied by dataset. CIF, however, was the most stable (average CoV-R² = 0.12), followed by RF, XGB, and BRT (0.13–0.15). BRT was most effective at distinguishing among predictors, followed by CIF and Lasso. Considering all criteria, CIF, XGB, and BRT ranked similarly high, followed by RF and Lasso. The top three models also showed similar predictor rankings, while RF and Lasso differed. Reducing predictors by 58% had little effect on accuracy or stability, and averaging predictions across replicate models should mitigate the effects of model instability. These findings support more robust ML applications in biodiversity research.
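The evaluation criteria named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy richness values, and the use of the sample standard deviation in the coefficient of variation are all assumptions made for illustration, and the abstract does not say exactly how each quantity was computed. The sketch scores each replicate model with R² and RMSE, summarizes stability as the coefficient of variation of those scores across replicates, and averages replicate predictions as the abstract suggests. The same `coefficient_of_variation` helper could be applied to a vector of predictor importances as one possible reading of "variation in predictor importance".

```python
from statistics import mean, stdev


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return (sum(squared) / len(squared)) ** 0.5


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return stdev(values) / mean(values)


def ensemble_mean(replicate_predictions):
    """Average predictions site by site across replicate models."""
    return [mean(site) for site in zip(*replicate_predictions)]


# Hypothetical toy data: observed species richness at four sites and
# predictions from three replicate models (e.g. different random seeds).
observed = [3, 5, 8, 12]
replicates = [
    [4, 5, 7, 11],
    [2, 6, 9, 12],
    [3, 4, 8, 13],
]

r2_scores = [r_squared(observed, preds) for preds in replicates]
rmse_scores = [rmse(observed, preds) for preds in replicates]

# Stability: lower CoV means the replicate models agree more closely.
cov_r2 = coefficient_of_variation(r2_scores)
cov_rmse = coefficient_of_variation(rmse_scores)

# Averaging replicate predictions yields a single, steadier prediction.
averaged = ensemble_mean(replicates)
averaged_rmse = rmse(observed, averaged)
```

Because RMSE is a scaled Euclidean norm of the errors, the RMSE of the averaged prediction cannot exceed the mean RMSE of the individual replicates, which is one way to see why averaging dampens replicate-to-replicate variation.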
Key words
Biodiversity, Freshwater fish, Supervised data mining, Penalized linear regression, Big data, Feature selection