Development and validation of an imageless machine-learning algorithm for the initial screening of prostate cancer

Nicolas Martelin; Brian De Witt; Benjamin Chen; Pascal Eschwège

doi:10.1002/pros.24703

Prostate. 2024 Apr 4. doi: 10.1002/pros.24703. Online ahead of print.

Authors

Nicolas Martelin¹, Brian De Witt¹, Benjamin Chen¹, Pascal Eschwège²,³

Affiliations

¹ Prostperia, Nancy, France.
² Urology Department, Nancy University Hospital, Vandœuvre-lès-Nancy, France.
³ Unité de Biologie des Tumeurs, CRAN UMR 7039 CNRS, Institut de cancérologie de Lorraine, Vandoeuvre-Lès-Nancy, France.

PMID: 38571454
DOI: 10.1002/pros.24703

Abstract

Purpose: Prostate-specific antigen (PSA) testing is a low-cost screening method for prostate cancer (PCa). However, its accuracy is limited. While progress is being made using medical imaging for PCa screening, PSA testing can still be improved as an easily accessible first step in the screening process. We aimed to develop and validate a new model that further personalizes the analysis of PSA using demographic, medical history, and lifestyle parameters, together with digital rectal examination (DRE) results.

Methods: Using data from 34,224 patients in the screening arm of the PLCO trial (22,188 for the training set and 12,036 for the validation set), we trained a gradient-boosting model (Model 1) whose features were a single PSA value and the personal variables available in the PLCO trial, excluding those that signaled an ex-ante assumption of PCa. A second model (Model 2) additionally included a DRE result. The primary outcome was the occurrence of PCa, while the aggressiveness of PCa was a secondary outcome. ROC analyses were used to compare both models to other initial screening tests.
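
The ROC comparison described in the Methods can be illustrated with a minimal sketch. It computes the AUC as the probability that a randomly chosen case receives a higher score than a randomly chosen non-case (the Mann-Whitney formulation), counting ties as one half. The function name `roc_auc` and all scores below are illustrative assumptions; they are not PLCO data and not the authors' code, and the abstract does not state which software was used.

```python
def roc_auc(scores, labels):
    """AUC as P(score of a random positive > score of a random negative).

    Ties count as 0.5. Runs in O(n_pos * n_neg), which is fine for a
    small illustration but not for tens of thousands of patients.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = PCa diagnosed, 0 = no PCa.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
# Raw PSA used directly as the ranking score (made-up values).
psa_values = [6.1, 3.2, 4.8, 3.5, 1.1, 2.0, 5.0, 0.9]
# Made-up risk scores standing in for a model's output.
model_scores = [0.81, 0.35, 0.70, 0.30, 0.05, 0.12, 0.40, 0.03]

auc_psa = roc_auc(psa_values, labels)
auc_model = roc_auc(model_scores, labels)
print(f"AUC, PSA alone:   {auc_psa:.3f}")
print(f"AUC, model score: {auc_model:.3f}")
```

Because the AUC depends only on how the scores rank cases against non-cases, a raw PSA value and a model probability can be compared on the same footing without choosing a cut-off first.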

Results: The area under the curve (AUC) for Model 2 was 0.894 overall and 0.908 for patients with a suspicious DRE, compared to 0.808 for PSA for patients with a suspicious DRE. The AUC for Model 1 was 0.814 compared to 0.821 for PSA. Model 2 predicted 58% more high-risk PCa than PSA ≥4 ng/mL combined with an abnormal DRE and had a positive predictive value of 74.7% (vs. 50.6%).
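
The positive predictive value quoted above is the share of positive screens that turn out to be true cancers, PPV = TP / (TP + FP). The following minimal sketch shows that arithmetic, together with the relative-increase calculation that underlies a statement such as "58% more". The helper names and all counts are hypothetical and chosen only for illustration; the abstract does not report the underlying counts.

```python
def ppv(true_pos, false_pos):
    """Positive predictive value: TP / (TP + FP)."""
    flagged = true_pos + false_pos
    if flagged == 0:
        raise ValueError("no positive predictions")
    return true_pos / flagged


def relative_increase(new, baseline):
    """Fractional increase of `new` over `baseline` (0.5 means 50% more)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline


# Hypothetical counts, NOT taken from the study.
rule_ppv = ppv(true_pos=50, false_pos=50)    # rule-based screen
model_ppv = ppv(true_pos=75, false_pos=25)   # model-based screen
extra_detected = relative_increase(new=75, baseline=50)

print(f"PPV, rule:  {rule_ppv:.1%}")
print(f"PPV, model: {model_ppv:.1%}")
print(f"Relative increase in detected cases: {extra_detected:.0%}")
```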

Conclusion: Personalizing the interpretation of PSA values and DRE results with a gradient-boosting model showed promising results as a potential novel, low-cost method for the initial screening of PCa. The importance of DRE, when included in such a model, was also highlighted.

Keywords: DRE; PSA; artificial intelligence; initial screening; machine learning; prostate cancer.