TY - JOUR
T1 - Data-driven approaches linking wastewater and source estimation hazardous waste for environmental management
AU - Xie, Wenjun
AU - Yu, Qingyuan
AU - Fang, Wen
AU - Zhang, Xiaoge
AU - Geng, Jinghua
AU - Tang, Jiayi
AU - Jing, Wenfei
AU - Liu, Miaomiao
AU - Ma, Zongwei
AU - Yang, Jianxun
AU - Bi, Jun
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
AB - Industrial enterprises are major sources of contaminants, making their regulation vital for sustainable development. Tracking contaminant generation at the firm-level is challenging due to enterprise heterogeneity and the lack of a universal estimation method. This study addresses the issue by focusing on hazardous waste (HW), which is difficult to monitor automatically. We developed a data-driven methodology to predict HW generation using wastewater big data which is grounded in the availability of this data with widespread application of automatic sensors and the logical assumption that a correlation exists between wastewater and HW generation. We created a generic framework that used representative variables from diverse sectors, exploited a data-balance algorithm to address long-tail data distribution, and incorporated causal discovery to screen features and improve computation efficiency. Our method was tested on 1024 enterprises across 10 sectors in Jiangsu, China, demonstrating high fidelity (R² = 0.87) in predicting HW generation with 4,260,593 daily wastewater data.
UR - http://www.scopus.com/inward/record.url?scp=85197205517&partnerID=8YFLogxK
U2 - 10.1038/s41467-024-49817-6
DO - 10.1038/s41467-024-49817-6
M3 - Journal article
C2 - 38926394
AN - SCOPUS:85197205517
SN - 2041-1723
VL - 15
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 5432
ER -