Forecasting PM2.5 concentration levels using shallow machine learning models on the Monterrey Metropolitan Area in Mexico

ATMOSPHERIC POLLUTION RESEARCH(2023)

引用 0|浏览3
暂无评分
摘要
The Monterrey Metropolitan Area is one of the most densely populated and polluted regions in Latin America. Hence, providing early warnings to the population when pollutant concentrations reach high levels is critical. This allows people at higher health risk to make informed decisions about when to go out, mitigating future health complications. Using forecasting models, we can produce timely warnings for future concentration levels. In this work, we implement a set of short-term shallow machine learning models that would serve as a baseline for future forecasting analyses of PM2.5 concentration levels in the Monterrey Metropolitan Area. The proposed approach starts with multiple imputation through chained equations for missing value imputation, the incorporation of time metadata, and target winsorization. Then, we rely on the well-known random search for parameter optimization of the machine learning models and k-fold cross-validation, obtaining favorable results. We devise these models for a single-step and single-station analysis on an hourly multivariate air quality dataset (containing 77203 rows and 16 columns from the first hour of January 1, 2015 00:00:00 to April 17, 2022 23:00:00) and compare them using standard regression metrics. Therefore, we identify the forecasting model with the best performance, which was an Extra Trees Regressor with a Root Mean Squared Error of 0.013, a Mean Absolute Error of 0.006 (equivalent to a Mean Absolute Percentage Error of 0.294% and a Symmetric Mean Absolute Percentage Error of 0.078%), and a Maximum Error of 0.187 mu g/m(3).
更多
查看译文
关键词
Air quality forecasting,PM2.5 forecasting,Machine learning,Regression
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要