Forty-eight potential outliers in air pollution measurements taken simultaneously in Detroit, Michigan, USA and Windsor, Ontario, Canada in 2008 and 2009 were identified using four independent methods: box plots, variogram clouds, difference maps, and the Local Moran's I statistic. These methods were subsequently used in combination to reduce the candidates to a final set of 13 outliers for nitrogen dioxide (NO2), volatile organic compounds (VOCs), total benzene, toluene, ethyl benzene, and xylene (BTEX), and particulate matter in two size fractions (PM2.5 and PM10). The selected outliers were excluded from the measurement datasets, and the reduced datasets were used to revise air pollution models. In addition, a set of temporally scaled air pollution models was generated using time series measurements from community air quality monitors, with and without the selected outliers. The influence of outlier exclusion on associations with asthma exacerbation rates aggregated at a postal zone scale in both cities was evaluated. Results demonstrate that the inclusion or exclusion of outliers influences the strength of observed associations between intraurban air quality and asthma exacerbation in both cities. The box plot, variogram cloud, and difference map methods largely determined the final list of outliers because their results agreed closely. The Moran's I approach was not useful for outlier identification in the datasets studied. Removing outliers changed the spatial distribution of modeled concentration values and derivative exposure estimates averaged over postal zones. Overall, associations between air pollution and acute asthma exacerbation rates were weaker with outliers removed, but improved with the addition of temporal information. Decreases in statistically significant associations between air pollution and asthma resulted, in part, from smaller pollutant concentration ranges used for linear regression.
Nevertheless, the practice of identifying outliers through congruence among multiple methods strengthens confidence in the analysis of outlier presence and influence in environmental datasets.
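The idea of selecting outliers by agreement among methods can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the 1.5 × IQR box plot fence, the two-method agreement threshold, the function names, and the concentration values are all assumptions chosen for illustration, and the variogram cloud and difference map flags are supplied by hand rather than computed.

```python
import numpy as np


def boxplot_outliers(values, k=1.5):
    """Flag values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is the conventional box plot whisker length (an assumption here).
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


def consensus_outliers(flags_by_method, min_agree=2):
    """Keep only sites flagged by at least `min_agree` methods.

    `flags_by_method` maps a method name to a boolean sequence with one
    entry per monitoring site. The threshold is a hypothetical choice.
    """
    votes = np.sum(np.vstack(list(flags_by_method.values())), axis=0)
    return votes >= min_agree


# Hypothetical concentrations at seven monitoring sites (illustrative only).
concentrations = [12, 14, 15, 13, 16, 14, 40]

flags = {
    "box_plot": boxplot_outliers(concentrations),
    # Hand-entered placeholder flags standing in for the other methods.
    "variogram_cloud": [False, False, True, False, False, False, True],
    "difference_map": [False, False, False, False, False, False, True],
}

selected = consensus_outliers(flags, min_agree=2)
print(np.flatnonzero(selected))  # indices of sites flagged by >= 2 methods
```

In this sketch a site flagged by only one method is dropped from the candidate list, which mirrors in spirit how a larger pool of potential outliers can be reduced to a smaller final set.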
- Air pollution
- Intraurban variation