TY - JOUR
T1 - Identification and influence of spatio-temporal outliers in urban air quality measurements
AU - O'Leary, Brendan
AU - Reiners, John J.
AU - Xu, Xiaohong
AU - Lemke, Lawrence D.
N1 - Funding Information:
Funding sources and IRB approval: Support for this project was provided by a grant from the W.K. Kellogg Foundation, P3018205, and through a Wayne State University Career Development Grant. Institutional Review Board approval for the original asthma study (reported in Lemke et al., 2014) was obtained from the participating institutions (IRB# 073508B3X) and all personal and health information was de-identified and coded to protect the identity of individuals. IRB approval for a subsequent study, which funded the outlier evaluation and temporal scaling of air pollution reported in this paper, was obtained through Wayne State University (IRB# 091012MP2E).
Publisher Copyright:
© 2016 The Authors
PY - 2016/12/15
Y1 - 2016/12/15
N2 - Forty-eight potential outliers in air pollution measurements taken simultaneously in Detroit, Michigan, USA and Windsor, Ontario, Canada in 2008 and 2009 were identified using four independent methods: box plots, variogram clouds, difference maps, and the Local Moran's I statistic. These methods were subsequently used in combination to reduce and select a final set of 13 outliers for nitrogen dioxide (NO2), volatile organic compounds (VOCs), total benzene, toluene, ethyl benzene, and xylene (BTEX), and particulate matter in two size fractions (PM2.5 and PM10). The selected outliers were excluded from the measurement datasets and used to revise air pollution models. In addition, a set of temporally scaled air pollution models was generated using time series measurements from community air quality monitors, with and without the selected outliers. The influence of outlier exclusion on associations with asthma exacerbation rates aggregated at a postal zone scale in both cities was evaluated. Results demonstrate that the inclusion or exclusion of outliers influences the strength of observed associations between intraurban air quality and asthma exacerbation in both cities. The box plot, variogram cloud, and difference map methods largely determined the final list of outliers, due to the high degree of conformity among their results. The Moran's I approach was not useful for outlier identification in the datasets studied. Removing outliers changed the spatial distribution of modeled concentration values and derivative exposure estimates averaged over postal zones. Overall, associations between air pollution and acute asthma exacerbation rates were weaker with outliers removed, but improved with the addition of temporal information. Decreases in statistically significant associations between air pollution and asthma resulted, in part, from smaller pollutant concentration ranges used for linear regression. Nevertheless, the practice of identifying outliers through congruence among multiple methods strengthens confidence in the analysis of outlier presence and influence in environmental datasets.
KW - Air pollution
KW - Asthma
KW - Intraurban variation
KW - Outlier
KW - Spatio-temporal
UR - http://www.scopus.com/inward/record.url?scp=84982252282&partnerID=8YFLogxK
U2 - 10.1016/j.scitotenv.2016.08.031
DO - 10.1016/j.scitotenv.2016.08.031
M3 - Article
C2 - 27552730
AN - SCOPUS:84982252282
SN - 0048-9697
VL - 573
SP - 55
EP - 65
JO - Science of the Total Environment
JF - Science of the Total Environment
ER -