TY - JOUR
T1 - Explainable ensemble models for predicting wall thickness loss of water pipes
AU - Taiwo, Ridwan
AU - Yussif, Abdul Mugis
AU - Ben Seghier, Mohamed El Amine
AU - Zayed, Tarek
N1 - Publisher Copyright: © 2024 THE AUTHORS
PY - 2024/4
Y1 - 2024/4
AB - Water Distribution Networks (WDNs) are susceptible to pipe failures with significant consequences. Predicting wall-thickness loss in pipes is vital for proactive maintenance and asset management. This study develops optimized, explainable machine learning models for this purpose. Data from four WDNs located in Canada and the USA are collected and preprocessed. Decision Tree, Random Forest (RF), XGBoost, LightGBM, and CatBoost are employed, with hyperparameters optimized via the Tree-structured Parzen Estimator. The performance of the proposed framework is assessed using dissimilarity-based and similarity-based metrics. Hyperparameter optimization substantially enhances predictive performance; for example, the mean absolute error of RF improves by 20.51%. Based on the evaluation metrics, the Copeland algorithm is employed to rank the models, and CatBoost emerges as the best-performing model with a Copeland score of 4, followed by XGBoost and RF. The Taylor diagram offers a visual representation of the linear proportionality between observed and predicted values across the models, with CatBoost and XGBoost showing strong alignment. SHAP analysis identifies age, diameter, and length as key contributors to the predicted wall-thickness loss. The optimized models can help identify potential pipe failures proactively, supporting maintenance and WDN management.
KW - Ensemble learning
KW - Machine learning
KW - SHAP
KW - Wall thickness
KW - Water pipelines
UR - http://www.scopus.com/inward/record.url?scp=85184675398&partnerID=8YFLogxK
U2 - 10.1016/j.asej.2024.102630
DO - 10.1016/j.asej.2024.102630
M3 - Journal article
AN - SCOPUS:85184675398
SN - 2090-4479
VL - 15
JO - Ain Shams Engineering Journal
JF - Ain Shams Engineering Journal
IS - 4
M1 - 102630
ER -