Non-linear probabilistic calibration of low-cost environmental air pollution sensor networks for neighborhood level spatiotemporal exposure assessment

Journal of Exposure Science & Environmental Epidemiology (2022)

Abstract
Background: Low-cost sensor networks for monitoring air pollution are an effective tool for expanding spatial resolution beyond the capabilities of existing state and federal reference monitoring stations. However, low-cost sensor data commonly exhibit non-linear biases with respect to environmental conditions that cannot be captured by linear models, and therefore require extensive lab calibration. Further, these calibration models traditionally produce point estimates or uniform-variance predictions, which limits their downstream use in exposure assessment.

Objective: Build direct field-calibration models using probabilistic gradient boosted decision trees (GBDT) that eliminate the need for resource-intensive lab calibration and that can be used to conduct probabilistic exposure assessments at the neighborhood level.

Methods: Using data from Plantower A003 particulate matter (PM) sensors deployed in Baltimore, MD from November 2018 through November 2019, a fully probabilistic NGBoost GBDT was trained on raw data from sensors co-located with a federal reference monitoring station and compared against linear regression trained on lab-calibrated sensor data. The NGBoost predictions were then used in a Monte Carlo interpolation process to generate high spatial resolution probabilistic exposure gradients across Baltimore.

Results: We demonstrate that direct field calibration of the raw PM2.5 sensor data using a probabilistic GBDT achieves better point and distributional accuracy than the linear model, particularly at reference measurements exceeding 25 μg/m³, and also on monitors not included in the training set.

Significance: We provide a framework for using the GBDT to conduct probabilistic spatial assessments of human exposure with inverse distance weighting; the framework predicts the probability that a given location exceeds an exposure threshold and provides percentiles of exposure. These probabilistic spatial exposure assessments can be scaled in time and space with minimal modifications. Here, we used the probabilistic exposure assessment methodology to create high-quality spatiotemporal PM2.5 maps at the neighborhood scale in Baltimore, MD.

Impact statement: We demonstrate that open-source probabilistic machine learning models for in-place sensor calibration outperform traditional linear models and do not require an initial laboratory calibration step. Further, the predictive distributions from these models can be propagated through a Monte Carlo interpolation process to produce probabilistic spatial exposure assessments.
Keywords
Exposure modeling, Air pollution, Sensors, Geospatial analyses
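
The Monte Carlo interpolation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each sensor's calibrated PM2.5 prediction is summarized as a Normal distribution (a mean and a standard deviation, as a probabilistic calibration model might return), that sensors are treated as independent, and that plain inverse distance weighting with power 2 is used. The function names, coordinates, and concentration values are all hypothetical.

```python
import numpy as np


def idw_weights(sensor_xy, grid_xy, power=2.0, eps=1e-9):
    """Inverse distance weights: one row per grid point, each row sums to 1."""
    # Pairwise distances, shape (n_grid, n_sensors).
    d = np.linalg.norm(grid_xy[:, None, :] - sensor_xy[None, :, :], axis=2)
    w = 1.0 / (d + eps) ** power  # eps avoids division by zero at a sensor
    return w / w.sum(axis=1, keepdims=True)


def monte_carlo_idw(sensor_xy, mu, sigma, grid_xy, threshold,
                    n_draws=2000, seed=0):
    """Propagate per-sensor predictive uncertainty onto a grid.

    Assumption: independent Normal predictive distributions per sensor.
    Returns the exceedance probability per grid point and the
    5th/50th/95th percentiles of the interpolated concentration.
    """
    rng = np.random.default_rng(seed)
    # One row per Monte Carlo draw, one column per sensor.
    draws = rng.normal(mu, sigma, size=(n_draws, len(mu)))
    draws = np.clip(draws, 0.0, None)  # simplification: no negative PM2.5
    w = idw_weights(sensor_xy, grid_xy)       # (n_grid, n_sensors)
    surfaces = draws @ w.T                    # (n_draws, n_grid)
    p_exceed = (surfaces > threshold).mean(axis=0)
    pct = np.percentile(surfaces, [5, 50, 95], axis=0)
    return p_exceed, pct


# Hypothetical example: three sensors and three grid points.
sensor_xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
mu = np.array([8.0, 30.0, 12.0])     # predictive means, in ug/m3
sigma = np.array([1.0, 3.0, 2.0])    # predictive standard deviations
grid_xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.5]])

p_exceed, pct = monte_carlo_idw(sensor_xy, mu, sigma, grid_xy, threshold=25.0)
```

Each Monte Carlo draw yields one complete interpolated surface, so the exceedance probability at a grid point is simply the fraction of draws in which that point's interpolated value is above the threshold, and the percentiles are taken across draws.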