TY - JOUR
T1 - Assessing the effectiveness of non-point source pollution models in data-limited urban areas
AU - Shang, Fangze
AU - Tang, Sijie
AU - Wang, Hantao
AU - Yang, Ruiyi
AU - Hou, Zhiqiang
AU - Ping, Yang
AU - Zhang, Zhenzhou
AU - Chen, Huayu
AU - Yu, Yange
AU - Goonetilleke, Ashantha
AU - Jun, Changhyun
AU - Tian, Xin
AU - Wang, Shuo
AU - Wan, Ying
AU - Jiang, Jiping
N1 - Publisher Copyright:
© 2025
PY - 2025/11
Y1 - 2025/11
AB - Non-point source (NPS) pollution from stormwater runoff has become a major threat to urban water bodies. Rapid and reliable pollution profiling is essential for effective mitigation, yet early-stage stormwater management often lacks detailed drainage data and long-term monitoring, complicating model selection. This study evaluates the performance and practical utility of three widely used NPS modeling approaches—statistical regression, machine learning, and physical process-based models—using a large-scale field monitoring dataset. Improved Export Coefficient Method models achieved high accuracy for TN and COD (R² > 0.7) but showed overfitting risks due to collinearity. Random Forest Regression predicted COD, TN, NH₃-N, and TP well (R² > 0.6) but struggled with predicting TSS loads. In contrast, SWMM models failed to deliver reliable predictions, even after auto-calibration, underscoring their limitations without prior user expertise. Factor contribution analysis highlighted antecedent dry period, rainfall depth, and land use as key predictors. Nitrogen-related pollutants were more influenced by dry deposition, while phosphorus was more affected by rainfall-triggered wash-off. Finally, a practical multi-criteria evaluation framework, considering accuracy, generalizability, robustness, and cost-efficiency, is proposed to guide model selection under data-limited conditions. This study is expected to promote the utility of machine learning models in practice and provide theoretical support for NPS pollution mitigation in urban areas.
KW - Export coefficient method
KW - Non-point source
KW - Random forest
KW - Stormwater quality
KW - SWMM
KW - Urban runoff
UR - https://www.scopus.com/pages/publications/105007531977
U2 - 10.1016/j.jhydrol.2025.133636
DO - 10.1016/j.jhydrol.2025.133636
M3 - Journal article
AN - SCOPUS:105007531977
SN - 0022-1694
VL - 661
JO - Journal of Hydrology
JF - Journal of Hydrology
M1 - 133636
ER -