
Prediction of failures in sewer networks using various machine learning classifiers

Pages 877-893 | Received 11 Dec 2023, Accepted 09 May 2024, Published online: 30 May 2024
 

ABSTRACT

In sewer networks, failure prediction plays a significant role in the operation and maintenance plans of wastewater utilities. This study aims to identify the variables that influence failures by using feature selection (FS) algorithms and to achieve maximum model accuracy with a minimum number of variables. Four scenarios based on the suggested FS algorithms were developed. In these scenarios, the best prediction models were investigated using machine learning (ML) classifiers: neural network classifier (NNC), gradient boosting machine (GBM), random forest (RF), and a hybrid model (HM). The classification performance of the ML models was evaluated using accuracy, precision, F1 score, and the receiver operating characteristics (ROC) curve. The ML algorithms achieved scores ranging from 0.99 for accuracy to 1 for the ROC curve. In conclusion, the ML algorithms suggested in this study may serve as a decision support tool for wastewater utilities in prioritizing the replacement, maintenance, and inspection of sewer pipes.
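The evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the study's code: the labels, scores, and the 0.5 decision threshold below are invented for illustration, and the function names are hypothetical. It computes accuracy, precision, and F1 score from binary predictions, and the area under the ROC curve (AUC) as the fraction of failure/non-failure pairs that the classifier ranks correctly.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = failure, 0 = no failure)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 score as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative data only (assumption): 8 pipes, 4 of which failed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, and F1 depend on the chosen threshold, whereas the AUC depends only on how the scores rank the pipes, which is one reason the two kinds of measure are usually reported side by side.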

List of abbreviations

The following abbreviations are used in this paper:

Acronym = Definition
ANN = Artificial neural network
AUC = Area under the curve
Bagging = Bootstrap aggregating
C = Concrete
CCTV = Closed-circuit television
CNL = Capacity per unit network length
CNNs = Convolutional neural networks
CP = Corrugated pipe with muff
CV = Cross validation
DI = Ductile iron
DT = Decision trees
FS = Feature selection
FPC = False positive class
FPR = False positive rate
GBM = Gradient boosting machine
GIS = Geographic information system
GM = Geometric mean
GP = Genetic programming
HDPE = High-density polyethylene
HM = Hybrid model
ISU = Kocaeli Water and Sewerage Administration
LASSO = Least absolute shrinkage and selection operator
LBFGS = Limited-memory Broyden-Fletcher-Goldfarb-Shanno
LR = Logistic regression
LSTM = Long short-term memory
MCC = Matthews correlation coefficient
ML = Machine learning
MRMR = Minimum redundancy maximum relevance
NNC = Neural network classifier
NL = Network length
OBB = Out-of-bag
PC = Pearson correlation
PVC = Polyvinyl chloride
ReLU = Rectified linear unit
RC = Reinforced concrete
RF = Random forest
ROC = Receiver operating characteristics
ST = Steel
SUEN = Turkish Water Institute
SVM = Support vector machine
TNC = True negative class
TPC = True positive class
XGB = Extreme gradient boosting

Disclosure statement

No potential conflict of interest was reported by the author(s).

Supplementary material

Supplemental data for this article can be accessed at https://doi.org/10.1080/1573062X.2024.2360184.
