Abstract
Water pipe failures pose critical challenges for utilities and communities, causing substantial water losses, service disruptions, health risks, and infrastructure damage. Understanding why pipes fail, not just when or where, is essential for developing targeted maintenance strategies. Whereas previous studies have focused on predicting whether a failure will occur, this research addresses the gap in predicting the specific underlying causes of water pipe failures. The study develops and compares ensemble learning models that classify failures into four categories: external loading, corrosion, faulty material, and faulty workmanship. It introduces stacking and voting classifiers as novel approaches in this domain and benchmarks them against five established ensemble algorithms (AdaBoost, Random Forest, XGBoost, LightGBM, and CatBoost), using the Tree-structured Parzen Estimator (TPE) for hyperparameter optimization. Using a comprehensive dataset from Hong Kong's water distribution network, the study evaluates model performance with ten metrics, including macro-averaged and weighted-averaged metrics and entropy-based uncertainty quantification. The results show that TPE improves on baseline model performance, with XGBoost's macro F1 score increasing by 10.3%. A systematic selection approach based on the Copeland algorithm identifies the optimized voting classifier as the best model, with a macro F1 of 0.634 and a weighted F1 of 0.813. SHAP analysis identifies pressure, material type, traffic load, precipitation, and pipe diameter as the most influential predictive factors, providing actionable insights for decision-makers. The research culminates in a deployed web application that enables utilities to predict failure causes for individual pipes, facilitating targeted interventions that can reduce maintenance costs, extend infrastructure lifespan, and improve service reliability. This integrated approach to prediction, interpretation, and application represents a significant advancement for water infrastructure management.
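
The abstract names the Copeland algorithm as the basis for selecting among models scored on several metrics, without describing the procedure. The following is a minimal sketch of one common formulation of Copeland ranking, not the authors' implementation: each pair of models is compared metric by metric, the model that is better on more metrics wins the pairing, and a model's Copeland score is its pairwise wins minus its pairwise losses. The function name `copeland_ranking`, the model names, and all numbers are hypothetical and purely illustrative. The sketch also assumes that higher is better for every metric, so a lower-is-better measure such as entropy would need to be negated before being passed in.

```python
from itertools import combinations


def copeland_ranking(scores):
    """Rank models by Copeland score (pairwise wins minus pairwise losses).

    scores: {model_name: {metric_name: value}}, where a higher value is
    assumed to be better for every metric (an assumption of this sketch).
    Returns a list of (model_name, copeland_score), best first.
    """
    models = list(scores)
    tally = {m: 0 for m in models}
    for a, b in combinations(models, 2):
        # Compare only on metrics reported for both models.
        shared = scores[a].keys() & scores[b].keys()
        a_wins = sum(scores[a][k] > scores[b][k] for k in shared)
        b_wins = sum(scores[b][k] > scores[a][k] for k in shared)
        if a_wins > b_wins:
            tally[a] += 1
            tally[b] -= 1
        elif b_wins > a_wins:
            tally[b] += 1
            tally[a] -= 1
        # A tie on the pairing leaves both tallies unchanged.
    return sorted(tally.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical scores, invented for illustration only.
toy_scores = {
    "model_a": {"macro_f1": 0.60, "weighted_f1": 0.80, "accuracy": 0.82},
    "model_b": {"macro_f1": 0.58, "weighted_f1": 0.81, "accuracy": 0.80},
    "model_c": {"macro_f1": 0.55, "weighted_f1": 0.78, "accuracy": 0.79},
}

ranking = copeland_ranking(toy_scores)
print(ranking)
```

In this toy case `model_a` ranks first even though `model_b` has the higher weighted F1, because `model_a` is better on two of the three metrics in their head-to-head comparison. This shows how a pairwise scheme can select a model that does not lead on every individual metric.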
| Original language | English |
|---|---|
| Article number | 111320 |
| Journal | Reliability Engineering and System Safety |
| Volume | 264 |
| Publication status | Published - Dec 2025 |
Keywords
- Causes of pipe failure
- Ensemble algorithms
- Prediction model
- SHAP
- Stacking classifier
- Voting classifier
- Water distribution network
- Water pipe failure
- WDN
ASJC Scopus subject areas
- Safety, Risk, Reliability and Quality
- Industrial and Manufacturing Engineering