TY - JOUR
T1 - A gradient boost approach for predicting near-road ultrafine particle concentrations using detailed traffic characterization
AU - Xu, Junshi
AU - Wang, An
AU - Schmidt, Nicole
AU - Adams, Matthew
AU - Hatzopoulou, Marianne
N1 - Funding Information:
The authors would like to acknowledge the crucial contributions of Mingqian Zhang, who assisted with data collection and with the development of a video processing system for traffic detection, tracking, and counting. This study was funded by a grant from the X-Seed program at the University of Toronto, jointly held by Professors Matthew Adams and Marianne Hatzopoulou.
Publisher Copyright:
© 2020 Elsevier Ltd
PY - 2020/10
Y1 - 2020/10
N2 - This study investigates the influence of meteorology, land use, built environment, and traffic characteristics on near-road ultrafine particle (UFP) concentrations. To achieve this objective, minute-level UFP concentrations were measured at various locations along a major arterial road in the Greater Toronto Area (GTA) between February and May 2019. Each location was visited five times, at least once each in the morning, mid-day, and afternoon. Each visit lasted 30 min, resulting in 2.5 h of minute-level data collected at each location. Local traffic information, including vehicle class and turning movements, was processed using computer vision techniques. The numbers of fast-food restaurants, cafes, trees, and traffic signals, as well as building footprint, were positively associated with mean UFP, while distance to the closest major road was negatively associated with UFP. We employed the Extreme Gradient Boosting (XGBoost) method to develop prediction models for UFP concentrations. The Shapley additive explanation (SHAP) measures were used to capture the influence of each feature on model output. The model results demonstrated that minute-level counts of local traffic from different directions had significant impacts on near-road UFP concentrations. Model performance was robust under random cross-validation, with coefficients of determination (R2) ranging from 0.63 to 0.69, but it revealed weaknesses when data from specific locations were excluded from the training dataset. This result indicates that proper cross-validation techniques should be developed to better evaluate machine learning models for air quality predictions.
AB - This study investigates the influence of meteorology, land use, built environment, and traffic characteristics on near-road ultrafine particle (UFP) concentrations. To achieve this objective, minute-level UFP concentrations were measured at various locations along a major arterial road in the Greater Toronto Area (GTA) between February and May 2019. Each location was visited five times, at least once each in the morning, mid-day, and afternoon. Each visit lasted 30 min, resulting in 2.5 h of minute-level data collected at each location. Local traffic information, including vehicle class and turning movements, was processed using computer vision techniques. The numbers of fast-food restaurants, cafes, trees, and traffic signals, as well as building footprint, were positively associated with mean UFP, while distance to the closest major road was negatively associated with UFP. We employed the Extreme Gradient Boosting (XGBoost) method to develop prediction models for UFP concentrations. The Shapley additive explanation (SHAP) measures were used to capture the influence of each feature on model output. The model results demonstrated that minute-level counts of local traffic from different directions had significant impacts on near-road UFP concentrations. Model performance was robust under random cross-validation, with coefficients of determination (R2) ranging from 0.63 to 0.69, but it revealed weaknesses when data from specific locations were excluded from the training dataset. This result indicates that proper cross-validation techniques should be developed to better evaluate machine learning models for air quality predictions.
KW - Cross-validation
KW - K-means clustering
KW - Local traffic
KW - Machine learning
KW - Short-term fixed monitoring
UR - http://www.scopus.com/inward/record.url?scp=85086433299&partnerID=8YFLogxK
U2 - 10.1016/j.envpol.2020.114777
DO - 10.1016/j.envpol.2020.114777
M3 - Journal article
C2 - 32540592
AN - SCOPUS:85086433299
SN - 0269-7491
VL - 265
JO - Environmental Pollution
JF - Environmental Pollution
M1 - 114777
ER -