Enhancement of prostate cancer diagnosis by machine learning techniques: an algorithm development and validation study

Peter Ka Fung Chiu, Xiao Shen, Cho Lik Ho, Chi Ho Leung, Chi Fai Ng, Kup Sze Choi, Jeremy Yuen Chun Teoh

Research output: Journal article publication › Journal article › Academic research › peer-review

14 Citations (Scopus)


Background: To investigate the value of machine learning (ML) in enhancing prostate cancer (PCa) diagnosis. Methods: Consecutive systematic prostate biopsies performed from January 2003 to June 2017 were used as the training cohort, and prospective biopsies performed from July 2017 to November 2019 were used as the validation cohort. Men were included if PSA was 0.4–50 ng/mL and digital rectal examination (DRE) findings, transrectal ultrasound (TRUS) prostate volume, and TRUS abnormality were known. Clinically significant PCa (csPCa) was defined as Gleason 3 + 4 or above. The area under the receiver operating characteristic (ROC) curve (AUC) was compared between PSA, PSA density, the European Randomized Study of Screening for Prostate Cancer (ERSPC) risk calculator (ERSPC-RC), and various ML techniques using PSA, DRE and TRUS information. The ML techniques were XGBoost, LightGBM, CatBoost, support vector machine (SVM), logistic regression (LR), and random forest (RF), with cost-sensitive learning applied. Results: The training and validation cohorts included 3881 and 778 consecutive men, respectively. The RF model performed better than the other ML techniques, PSA, PSA density and ERSPC-RC for prediction of PCa or csPCa in the validation cohort. For csPCa prediction, the AUCs of PSA, PSA density, ERSPC-RC and RF were 0.71, 0.80, 0.83 and 0.88, respectively. At 90–95% sensitivity for csPCa, the RF model achieved a negative predictive value (NPV) of 97.5–98.0% and avoided 38.3–52.2% of unnecessary biopsies. Decision curve analyses (DCA) showed that the RF model provided net clinical benefit over PSA, PSA density and ERSPC-RC. Conclusion: Using the same clinical parameters, ML techniques performed better than ERSPC-RC or PSA density in csPCa prediction and could avoid up to 50% of unnecessary biopsies.
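The threshold-based figures in the abstract (NPV and biopsies avoided at a fixed sensitivity, and net benefit in decision curve analysis) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the toy labels and the toy scores below are hypothetical. "Biopsies avoided" is assumed here to mean the share of all men whose score falls below the cut-off, since the abstract does not give the exact definition. Net benefit uses the standard decision-curve formula, TP/N − (FP/N) × p_t/(1 − p_t).

```python
from typing import List, NamedTuple


class ThresholdMetrics(NamedTuple):
    threshold: float         # score cut-off; biopsy is advised when score >= threshold
    sensitivity: float       # share of csPCa cases at or above the cut-off
    npv: float               # share of men below the cut-off who have no csPCa
    biopsies_avoided: float  # share of all men below the cut-off (assumed definition)


def metrics_at_sensitivity(labels: List[int], scores: List[float],
                           target_sensitivity: float) -> ThresholdMetrics:
    """Find the highest cut-off that reaches the target sensitivity and
    report NPV and the share of men who would skip biopsy at that cut-off."""
    if not 0.0 < target_sensitivity <= 1.0:
        raise ValueError("target_sensitivity must be in (0, 1]")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive case is required")

    # Try each distinct score as a cut-off, from highest to lowest, and
    # stop at the first one whose sensitivity meets the target.
    threshold, tp = 0.0, 0
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        if tp / n_pos >= target_sensitivity:
            break

    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fn = n_pos - tp
    below = tn + fn  # men who would not be biopsied
    npv = tn / below if below else float("nan")
    return ThresholdMetrics(threshold, tp / n_pos, npv, below / len(labels))


def net_benefit(labels: List[int], scores: List[float], p_t: float) -> float:
    """Net benefit at threshold probability p_t, as used in decision curves."""
    if not 0.0 < p_t < 1.0:
        raise ValueError("p_t must be strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= p_t)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= p_t)
    return tp / n - (fp / n) * (p_t / (1.0 - p_t))


# Hypothetical toy data (not from the study): 1 = csPCa, 0 = no csPCa.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]

if __name__ == "__main__":
    print(metrics_at_sensitivity(toy_labels, toy_scores, 0.75))
    print(net_benefit(toy_labels, toy_scores, 0.5))
```

Plotting `net_benefit` over a range of `p_t` values for each predictor, alongside the "biopsy all" and "biopsy none" strategies, would give a decision curve of the kind the abstract refers to.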

Original language: English
Journal: Prostate Cancer and Prostatic Diseases
Publication status: Accepted/In press - 2021

ASJC Scopus subject areas

  • Oncology
  • Urology
  • Cancer Research

