Estimating Escherichia coli levels using drone-based RGB imagery and machine learning techniques

crossref(2024)

Abstract
Rapid and efficient quantification of E. coli levels is an important goal of microbial water quality assessment. To address this, remote sensing and machine learning algorithms have recently been used. Applying these techniques is challenging because water quality datasets often contain few samples and are imbalanced. This study focused on estimating E. coli concentrations in a Maryland irrigation pond during the summer season. We used demosaiced drone-based RGB imagery across visible and infrared spectrum ranges along with 14 water quality parameters. Four machine learning algorithms (Random Forest (RF), Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGB), and K-nearest Neighbor) were applied under three scenarios: only water quality parameters, both water quality and drone-based RGB data, and only RGB data. Two data splitting methods were employed: traditional random data splitting (ordinary data splitting) and quantile data splitting, with the latter providing a constant splitting ratio across each decile of the E. coli concentration distribution. Quantile data splitting resulted in very good model performance and smaller differences between the training and testing datasets. The RF, GBM, and XGB models, trained with quantile data splitting and hyperparameter optimization, resulted in R² values above 0.847 for the training dataset and above 0.689 for the test dataset. The integration of water quality and imagery data led to larger R² values, exceeding 0.896 for the test dataset. Shapley additive explanations (SHAP) highlighted the visible blue spectrum intensity and water temperature as the most influential inputs to the RF model. Overall, demosaiced RGB imagery proved to be a valuable predictor of E. coli concentration across the studied irrigation pond.
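
The abstract describes quantile data splitting only as keeping a constant splitting ratio across each decile of the E. coli concentration distribution. One way this could be done is sketched below. It is a minimal illustration, not the authors' implementation: the function name `quantile_split`, the 80/20 ratio, the rank-based decile binning, and the synthetic log-normal concentrations are all assumptions made here for demonstration.

```python
import random

def quantile_split(values, test_fraction=0.2, n_bins=10, seed=0):
    """Return (train_idx, test_idx) with the same test share in every quantile bin.

    Hypothetical helper: bins are formed by rank, so n_bins=10 gives deciles.
    """
    rng = random.Random(seed)
    # Rank samples by target value so consecutive chunks are quantile bins.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    train_idx, test_idx = [], []
    for b in range(n_bins):
        # Integer bin edges keep bin sizes within one sample of each other.
        lo, hi = b * n // n_bins, (b + 1) * n // n_bins
        members = order[lo:hi]
        rng.shuffle(members)
        # Take the same fraction of every bin for the test set.
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Synthetic, skewed stand-in for E. coli concentrations (assumed, not study data).
data_rng = random.Random(42)
concentrations = [data_rng.lognormvariate(3.0, 1.0) for _ in range(200)]

train_idx, test_idx = quantile_split(concentrations)
print(len(train_idx), len(test_idx))
```

Unlike an ordinary random split, this draws the test set evenly from low, medium, and high concentrations, so a skewed distribution cannot leave the rare high values concentrated in only one of the two subsets.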