Watermain breaks and data: the intricate relationship between data availability and accuracy of predictions: Urban Water Journal: Vol 17, No 2


ABSTRACT

Many water utilities are facing a crisis of aging infrastructure. Aging pipes are deteriorating, and pipe breaks are increasing. A variety of pipe break prediction models have been developed to identify which pipes are most likely to break next, in order to assist utilities in prioritizing pipe replacement. This paper investigates the role of data in pipe break prediction model accuracy. A gradient boosting decision tree machine learning model (XGBoost), a Weibull proportional hazard probabilistic model and two ranking models (based on ‘age of pipe’ and ‘previous-break’) were calibrated using varying numbers of pipes, years of break records and input variables. The results indicate how the different model types are impacted by data limitations. Overall, this study finds the age-based approach to be inaccurate, and the XGBoost machine learning model demonstrates superior predictive capability when the training dataset contains more than 5 years of break records and 2,000 or more pipes.
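As a rough illustration of what the two ranking models named in the abstract might look like, the sketch below orders a toy set of pipes by installation year and by count of previous breaks. It is a minimal sketch under stated assumptions, not the authors' implementation: the study's data were processed in R, and the `Pipe` record, its field names, the function names and the three sample pipes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pipe:
    """Hypothetical pipe record; field names are assumptions for illustration."""
    pipe_id: str
    install_year: int
    previous_breaks: int


def rank_by_age(pipes):
    """Age-based ranking: oldest pipes (earliest install year) come first."""
    ordered = sorted(pipes, key=lambda p: p.install_year)
    return [p.pipe_id for p in ordered]


def rank_by_previous_breaks(pipes):
    """Previous-break ranking: pipes with the most recorded breaks come first."""
    ordered = sorted(pipes, key=lambda p: p.previous_breaks, reverse=True)
    return [p.pipe_id for p in ordered]


# Invented sample data, used only to show that the two rankings can disagree.
pipes = [
    Pipe("A", install_year=1955, previous_breaks=0),
    Pipe("B", install_year=1982, previous_breaks=3),
    Pipe("C", install_year=1968, previous_breaks=1),
]

age_ranking = rank_by_age(pipes)
break_ranking = rank_by_previous_breaks(pipes)
print("Age-based ranking:", age_ranking)
print("Previous-break ranking:", break_ranking)
```

In this toy data the oldest pipe has never broken, so the two rankings place it at opposite ends of the replacement list.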


Acknowledgements

This research was funded by the Natural Sciences and Engineering Research Council (NSERC). The authors are grateful for the help and data provided by the utility described in the case study of this report. In this research, data were processed in RStudio using the R language.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

All data used during the study are confidential and, by agreement with the municipalities, cannot be provided owing to their concerns about the security of their distribution systems.

Additional information

Funding

This work was supported by the Natural Sciences and Engineering Research Council of Canada.
