TY - JOUR
T1 - Semi-supervised learning framework for oil and gas pipeline failure detection
AU - Alobaidi, Mohammad H.
AU - Meguid, Mohamed A.
AU - Zayed, Tarek
N1 - Publisher Copyright: © 2022, The Author(s).
PY - 2022/12
Y1 - 2022/12
N2 - Quantifying failure events of oil and gas pipelines in real or near-real time facilitates a faster and more appropriate response plan. Developing a data-driven pipeline failure assessment model, however, faces a major challenge: failure history, in the form of incident reports, suffers from limited and missing information, making it difficult to provide a persistent input configuration to a supervised machine learning model. The literature falls short on appropriate solutions for utilizing incomplete databases and incident reports in the pipeline failure problem. This work proposes a semi-supervised machine learning framework that mines existing oil and gas pipeline failure databases. The proposed cluster-impute-classify (CIC) approach maps a relevant subset of the failure databases, through which missing information in the incident report is reconstructed. A classifier is then trained on the fly to learn the functional relationship between the descriptors from a diverse feature set. The proposed approach, presented within an ensemble learning architecture, is easily scalable to various pipeline failure databases. The results show up to 91% detection accuracy and stable generalization ability against increasing rates of missing information.
UR - http://www.scopus.com/inward/record.url?scp=85135789117&partnerID=8YFLogxK
U2 - 10.1038/s41598-022-16830-y
DO - 10.1038/s41598-022-16830-y
M3 - Journal article
C2 - 35962052
AN - SCOPUS:85135789117
SN - 2045-2322
VL - 12
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 13758
ER -