U-Air: when urban air quality inference meets big data

KDD, pp.1436-1444, (2013)

Cited by: 839 | Views: 192
EI

Abstract

Information about urban air quality, e.g., the concentration of PM2.5, is of great importance to protect human health and control air pollution. While there are limited air quality monitor stations in a city, air quality varies in urban spaces non-linearly and depends on multiple factors, such as meteorology, traffic volume, and land uses...

Introduction
  • Real-time air quality information, such as the concentration of NO2, PM2.5, and PM10, is of great importance to support air pollution control and protect humans from damage by air pollution.
  • There are insufficient air quality measurement stations in a city because such stations are expensive to build and maintain.
  • As shown in Figure 1 A), an air quality monitor station usually requires a sizable plot of land, a non-trivial budget, and staff to maintain it regularly.
Highlights
  • We evaluated our approach using 5 data sources consisting of points of interest, road networks, meteorological data, and air quality records of Beijing and Shanghai, and the GPS trajectories generated by over 30,000 taxis in Beijing, justifying the advantages of our approach over 4 baselines
  • Evaluation on Features: We first justify the effectiveness of the features, using the data shown in Table 5, where a DT model is employed to study the performance of individual features and their combinations
  • Supervised machine learning models, or an artificial neural network, are less effective than the co-training-based approach
  • From the perspective of big data, we infer the fine-granularity air quality in a city based on the AQIs reported by a few air quality monitor stations and four datasets observed in the city
  • Based on the datasets, we propose a co-training-based semi-supervised learning approach consisting of a spatial classifier and a temporal classifier
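
The co-training idea in the last highlight can be illustrated with a minimal sketch. It assumes two feature views per location (spatial and temporal) and uses a toy nearest-centroid model as a stand-in for both classifiers. The class `CentroidClassifier`, the functions `co_train` and `infer`, the product rule used to combine the two classifiers, and all data values are hypothetical and are not taken from the paper.

```python
from math import dist
from statistics import mean


class CentroidClassifier:
    """Toy nearest-centroid model; a stand-in for either classifier."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, l in zip(X, y) if l == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict_proba(self, x):
        # Closer centroids get higher scores; scores are normalized to sum to 1.
        inv = {l: 1.0 / (dist(x, c) + 1e-9) for l, c in self.centroids.items()}
        total = sum(inv.values())
        return {l: v / total for l, v in inv.items()}


def co_train(Xs_l, Xt_l, y_l, Xs_u, Xt_u, rounds=5, per_round=2):
    """Grow the labeled set using the confident predictions of each view."""
    Xs, Xt, y = list(Xs_l), list(Xt_l), list(y_l)
    pool = list(zip(Xs_u, Xt_u))  # unlabeled (spatial, temporal) pairs
    sc, tc = CentroidClassifier(), CentroidClassifier()
    for _ in range(rounds):
        sc.fit(Xs, y)
        tc.fit(Xt, y)
        for clf, view in ((sc, 0), (tc, 1)):
            if not pool:
                break
            scored = []
            for i, pair in enumerate(pool):
                proba = clf.predict_proba(pair[view])
                label = max(proba, key=proba.get)
                scored.append((proba[label], i, label))
            scored.sort(reverse=True)
            picked = scored[:per_round]
            # The most confident examples join the shared training set.
            for _, i, label in picked:
                Xs.append(pool[i][0])
                Xt.append(pool[i][1])
                y.append(label)
            chosen = {i for _, i, _ in picked}
            pool = [p for i, p in enumerate(pool) if i not in chosen]
        if not pool:
            break
    sc.fit(Xs, y)
    tc.fit(Xt, y)
    return sc, tc


def infer(sc, tc, xs, xt):
    # Assumption: combine the two views by multiplying their class scores.
    ps, pt = sc.predict_proba(xs), tc.predict_proba(xt)
    return max(ps, key=lambda l: ps[l] * pt[l])


# Hypothetical one-dimensional features for two labeled and four unlabeled grids.
sc, tc = co_train(
    Xs_l=[[0.0], [1.0]], Xt_l=[[0.0], [1.0]], y_l=["good", "unhealthy"],
    Xs_u=[[0.1], [0.9], [0.2], [0.8]], Xt_u=[[0.1], [0.9], [0.2], [0.8]],
)
low = infer(sc, tc, [0.05], [0.1])
high = infer(sc, tc, [0.95], [0.7])
```

Each classifier sees only its own view, so an example that one view labels confidently can still be informative for the other view. That is the usual motivation for co-training when labeled data is scarce.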
Methods
  • 5.1 Datasets

    In the evaluation, the authors use the following five real datasets detailed in Table 4, where the first four sources are available in Beijing and Shanghai.

    1) Meteorological data: The authors collect fine-grained meteorological data, consisting of weather, temperature, humidity, barometric pressure, and wind strength, from a public website every hour.

    2) Air quality records: The authors collect both real-valued and labeled AQI of four kinds of air pollutants, consisting of SO2, NO2, PM2.5, and PM10, reported by ground-based air quality monitor stations in the four cities every hour.
  • As a station may occasionally fail to report, the authors present the hours of effective records in Table 4.
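
The difference between a real-valued AQI and a labeled AQI can be sketched as a lookup. This example assumes the six US EPA categories and breakpoints; whether Table 1 of the paper uses exactly these is an assumption, and `aqi_label` is a hypothetical helper.

```python
from bisect import bisect_left

# Assumption: US EPA AQI breakpoints (upper bound of each category).
_UPPER_BOUNDS = [50, 100, 150, 200, 300, 500]
_LABELS = [
    "Good",
    "Moderate",
    "Unhealthy for Sensitive Groups",
    "Unhealthy",
    "Very Unhealthy",
    "Hazardous",
]


def aqi_label(aqi):
    """Map a real-valued AQI in [0, 500] to its category label."""
    if not 0 <= aqi <= 500:
        raise ValueError(f"AQI out of range: {aqi}")
    # bisect_left finds the first category whose upper bound is >= aqi.
    return _LABELS[bisect_left(_UPPER_BOUNDS, aqi)]


labels = [aqi_label(v) for v in (42, 100, 101, 275)]
```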
Results
  • To further study the ability of the approach in differentiating between more AQI labels, the authors solely test the spatial classifier.
  • Note that this is the result of SC rather than co-training.
  • The two time slots correspond to the morning and evening rush hours of Beijing, in which traffic flows would be the major cause of air pollutants.
Conclusion
  • From the perspective of big data, the authors infer the fine-granularity air quality in a city based on the AQIs reported by a few air quality monitor stations and four datasets observed in the city.
  • The authors first evaluated the co-training-based approach using the data obtained in Beijing, resulting in an overall precision of 0.828 and recall of 0.826 for PM10, and an overall precision of 0.808 and recall of 0.798 for NO2.
  • The authors applied the SC learnt from Beijing data to Shanghai, obtaining a result as good as that generated in Beijing.
  • These results demonstrate the approach is applicable to different city environments and seasons.
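
For readers who want to relate precision and recall figures to a confusion matrix such as Tables 6 and 7, the following is a minimal sketch of the standard per-class definitions. The function `precision_recall` and the 2x2 matrix are hypothetical. How the paper aggregates per-class values into an overall figure is not stated here, so no aggregation is shown.

```python
def precision_recall(cm):
    """Per-class (precision, recall) from a square confusion matrix.

    Assumption: cm[i][j] counts instances whose true label is i
    and whose predicted label is j.
    """
    n = len(cm)
    result = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        result.append((precision, recall))
    return result


# Hypothetical counts for two AQI labels; not data from the paper.
toy_cm = [[8, 2],
          [1, 9]]
per_class = precision_recall(toy_cm)
```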
Tables
  • Table 1: AQI values, descriptors, and color codes
  • Table 2
  • Table 3: Category of POIs we studied
  • Table 4: Details of the datasets
  • Table 5: Results related to features
  • Table 6: Confusion matrix of U-Air on PM10
  • Table 7: Confusion matrix of the Spatial Classifier
  • Table 8: Efficiency study
Related work
  • 6.1 Classical Bottom-up Emission Models

    There are two major ways of calculating the air quality of a location using the emission observed at ground surfaces, called “bottom-up” methods. One is interpolation using the reports from nearby air quality monitor stations. This method is usually employed by public websites releasing AQIs. As air quality varies non-linearly across locations, the inference accuracy is quite low (see Figure 15).

    The other is classical dispersion models, such as Gaussian Plume models, Operational Street Canyon models, and Computational Fluid Dynamics. These models are in most cases a function of meteorology, street geometry, receptor locations, traffic volumes, and emission factors (e.g., g/km per single vehicle), based on a number of empirical assumptions and parameters that might not be applicable to all urban environments [9]. For example, the Gaussian Plume model requires vehicle emission rates (e.g., g/km per hour) as input and assumes that the concentration is dispersed in the vertical and horizontal directions in a Gaussian manner. Some models may even require the height, length, and orientation of a street canyon, the gaps between buildings, as well as the roughness coefficient of the urban surface. As these parameters are difficult to obtain precisely, the results generated by such models may not be very accurate either. Compared with these models, our approach does not need empirical assumptions and parameters. Therefore, it is easy to apply and applicable to different city environments.
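
To make the "Gaussian manner" of dispersion concrete, here is a minimal sketch of the textbook steady-state Gaussian plume equation for a single point source with ground reflection. This is the generic form, not the vehicle (line-source) variant the passage refers to, and not code from the paper. The function name and parameters are hypothetical, and the dispersion widths `sigma_y` and `sigma_z` are passed in directly, although in practice they depend on downwind distance and atmospheric stability.

```python
from math import exp, pi


def gaussian_plume(q, u, sigma_y, sigma_z, y, z, h):
    """Concentration (g/m^3) at crosswind offset y and height z.

    q: emission rate (g/s); u: wind speed (m/s);
    sigma_y, sigma_z: horizontal and vertical dispersion widths (m);
    h: effective source height (m).
    """
    if min(u, sigma_y, sigma_z) <= 0:
        raise ValueError("u, sigma_y and sigma_z must be positive")
    crosswind = exp(-y**2 / (2 * sigma_y**2))
    # The second term mirrors the source below ground to model reflection.
    vertical = (exp(-(z - h)**2 / (2 * sigma_z**2))
                + exp(-(z + h)**2 / (2 * sigma_z**2)))
    return q / (2 * pi * u * sigma_y * sigma_z) * crosswind * vertical


# Hypothetical inputs: ground-level receptor, 20 m off the plume centerline.
c = gaussian_plume(q=10.0, u=3.0, sigma_y=30.0, sigma_z=15.0, y=20.0, z=0.0, h=5.0)
```

Even this simplest form needs an emission rate, a wind speed, and two dispersion parameters for each source, which illustrates why the passage describes such inputs as difficult to obtain precisely at city scale.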
References
  • A. V. Donkelaar, R. V. Martin, and R. J. Park (2006), Estimating ground-level PM2.5 using aerosol optical depth determined from satellite remote sensing, J. Geophys. Res., 111, D21201.
  • D. Hasenfratz, O. Saukh, S. Sturzenegger, and L. Thiele. Participatory Air Pollution Monitoring Using Smartphones. In the 2nd International Workshop on Mobile Sensing.
  • Y. Jiang, K. Li, L. Tian, R. Piedrahita, X. Yun, O. Mansata, Q. Lv, R. P. Dick, M. Hannigan, and L. Shang. Maqs: A personalized mobile sensing system for indoor air quality. In Proc. of UbiComp 2011.
  • J. Lafferty, A. McCallum, F. Pereira (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proc. of 18th International Conf. on Machine Learning.
  • S. Ma, Y. Zheng, O. Wolfson. T-Share: A Large-Scale Dynamic Taxi Ridesharing Service. In Proc. of ICDE 2013.
  • L. N. Lamsal, R. V. Martin, A. V. Donkelaar, M. Steinbacher, E. A. Celarier, E. Bucsela, E. J. Dunlea, and J. P. Pinto (2008), Ground-level nitrogen dioxide concentrations inferred from the satellite-borne Ozone Monitoring Instrument, J. Geophys. Res., 113, D1630.
  • R. V. Martin. Satellite remote sensing of surface air quality, Atmospheric Environment (2008), doi:10.1016.
  • K. Nigam, R. Ghani. Analyzing the Effectiveness and Applicability of Co-Training. In Proc. of CIKM 2000.
  • S. Vardoulakis, B. E. A. Fisher, K. Pericleous, N. Gonzalez-Flesca. Modelling air quality in street canyons: a review. Atmospheric Environment 37 (2003) 155-182.
  • J.S. Scire, D.G. Strimaitis and R.J. Yamartino, 2000b: User’s Guide for the CALPUFF Dispersion Model, (Version 5.0), Earth Tech, Inc.
  • J. Yuan, Y. Zheng, X. Xie. Discovering regions of different functions in a city using human mobility and POIs. In Proc. of KDD 2012.
  • J. Yuan, Y. Zheng, C. Zhang, X. Xie, G. Sun. An Interactive-Voting based Map Matching Algorithm. In Proc. of MDM 2010.
  • J. Yuan, Y. Zheng, X. Xie, G. Sun. Driving with Knowledge from the Physical World. In Proc. of KDD 2011.
  • Y. Zheng, Y. Liu, J. Yuan, X. Xie. Urban Computing with Taxicabs. In Proc. of UbiComp 2011.
  • F. Zhang, D. Wilkie, Y. Zheng, X. Xie. Sensing the Pulse of Urban Refueling Behavior. In Proc. of UbiComp 2013.