A novel ensemble learning framework for predicting the causes of water pipe failures

Ridwan Taiwo, Tarek Zayed, Bryan T. Adey

Research output: Journal article publication, Journal article, Academic research, peer-review

Abstract

Water pipe failures pose critical challenges for utilities and communities, causing substantial water losses, service disruptions, health risks, and infrastructure damage. Understanding why pipes fail, not just when or where, is essential for developing targeted maintenance strategies. While previous studies have focused on predicting failure occurrence, this research addresses the critical gap in predicting the specific underlying causes of water pipe failures. The study develops and compares advanced ensemble learning models to classify failures into four distinct categories: external loading, corrosion, faulty material, and faulty workmanship. The research introduces stacking and voting classifiers as novel approaches for this domain and benchmarks them against five established ensemble algorithms (AdaBoost, Random Forest, XGBoost, LightGBM, and CatBoost), using the Tree-structured Parzen Estimator (TPE) for hyperparameter optimization. Using a comprehensive dataset from Hong Kong's water distribution network, the study evaluates model performance through ten metrics, including macro and weighted metrics and uncertainty quantification via entropy. The results show that TPE improves baseline model performance, with XGBoost's macro F1 score increasing by 10.3%. A systematic selection approach using the Copeland algorithm identifies the optimized voting classifier as the superior model, achieving a macro F1 of 0.634 and a weighted F1 of 0.813. SHAP analysis reveals pressure, material type, traffic load, precipitation, and pipe diameter as the most influential predictive factors, providing actionable insights for decision makers. The research culminates in a deployed web application that enables utilities to predict failure causes for individual pipes, facilitating targeted interventions that can reduce maintenance costs, extend infrastructure lifespan, and improve service reliability. This integrated approach to prediction, interpretation, and application represents a significant advancement for water infrastructure management.
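The abstract names Copeland-based model selection and entropy-based uncertainty without describing either, so a minimal sketch may help readers unfamiliar with them. It assumes one common formulation of the Copeland score (pairwise wins minus pairwise losses, where a model wins a pairing if it is better on more metrics) and assumes that lower mean entropy is preferable; the study's exact procedure may differ. The function names, model names, and all numeric values are hypothetical and are not results from the paper.

```python
from itertools import combinations
from math import log


def copeland_scores(scores, higher_is_better):
    """Return a Copeland score (pairwise wins minus losses) per model.

    scores: {model_name: {metric_name: value}}
    higher_is_better: {metric_name: bool}
    """
    totals = {name: 0 for name in scores}
    for a, b in combinations(scores, 2):
        a_wins = b_wins = 0
        for metric, higher in higher_is_better.items():
            va, vb = scores[a][metric], scores[b][metric]
            if va == vb:
                continue  # a tie on this metric favours neither model
            if (va > vb) == higher:
                a_wins += 1
            else:
                b_wins += 1
        # The model that is better on more metrics wins the pairing.
        if a_wins > b_wins:
            totals[a] += 1
            totals[b] -= 1
        elif b_wins > a_wins:
            totals[b] += 1
            totals[a] -= 1
    return totals


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * log(p) for p in probs if p > 0)


# Made-up values for three hypothetical models; NOT results from the study.
example_scores = {
    "model_a": {"macro_f1": 0.60, "weighted_f1": 0.80, "mean_entropy": 0.50},
    "model_b": {"macro_f1": 0.58, "weighted_f1": 0.81, "mean_entropy": 0.55},
    "model_c": {"macro_f1": 0.55, "weighted_f1": 0.78, "mean_entropy": 0.60},
}
# Assumption: F1 scores should be high, mean entropy should be low.
directions = {"macro_f1": True, "weighted_f1": True, "mean_entropy": False}

ranking = copeland_scores(example_scores, directions)
best = max(ranking, key=ranking.get)
```

With four failure-cause classes, `predictive_entropy` ranges from 0 for a fully confident prediction to `log(4)` for a uniform one, so a per-pipe entropy near the upper bound would flag a prediction that deserves manual review.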

Original language: English
Article number: 111320
Journal: Reliability Engineering and System Safety
Volume: 264
Publication status: Published - Dec 2025

Keywords

  • Causes of pipe failure
  • Ensemble algorithms
  • Prediction model
  • SHAP
  • Stacking classifier
  • Voting classifier
  • Water distribution network
  • Water pipe failure
  • WDN

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Industrial and Manufacturing Engineering
