J. Winter1, L. Conroy1, M. Ramotar2, A. T. Santiago1, C. Catton3, P. Chung1, C. McIntosh4, T. G. Purdie1, and A. Berlin2; 1Radiation Medicine Program, Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada, 2Department of Radiation Oncology, Princess Margaret Cancer Centre, University of Toronto, Toronto, ON, Canada, 3Department of Radiation Oncology, University of Toronto, Toronto, ON, Canada, 4Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada
Purpose/Objective(s): Machine learning (ML) radiotherapy (RT) treatment planning has been shown to improve efficiency while maintaining plan quality. However, no prospective evaluation of patient outcomes has been reported with ML used as standard of care for RT planning, which limits assessment of its value proposition. We hypothesized that differences in genitourinary (GU) and gastrointestinal (GI) toxicities between ML- and human-generated RT plans would be clinically minimal during prospective application.
Materials/Methods: We prospectively evaluated ML- and human-generated plans for curative-intent prostate RT (60 Gy in 20 fractions) in a cohort of 113 consecutive patients treated between November 2019 and June 2022. We used a dose prediction ML model that was previously developed, validated, and clinically implemented at our institution and that runs within a commercial RT planning system. ML planning, without any manual adjustments, was the default planning method in all cases. Radiation oncologists either approved the ML plan or requested an alternative human-generated plan for direct comparison, and then selected the preferred plan for treatment. GU and GI toxicities were collected for all patients, with a minimum follow-up of 180 days. We performed a Kaplan-Meier toxicity-free survival analysis for grade 2+ GU and grade 2+ GI toxicities, comparing ML- and human-generated plans with log-rank tests.
Results: In this prospective standard-of-care ML deployment, radiation oncologists selected ML plans for clinical treatment in 86 cases (76%) and human plans in 27 cases (24%). Among the cases in which a human-generated plan was requested, the ML plan was selected for treatment in only one. No treatment-related grade 2+ GI toxicities were observed, and toxicity-free survival for grade 2+ GU toxicities did not differ significantly between ML- and human-generated plans (p = 0.39).
Conclusion: This is the first study demonstrating that dose prediction ML planning maintains low levels of toxicity in curative-intent prostate cancer RT, and it encourages the clinical translation of this technology into practice. When appropriately validated and deployed, ML planning can retain good clinical outcomes while improving efficiency and can be safely used as standard of care for the majority of patients, with a human-in-the-loop strategy.
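
The toxicity-free survival comparison described in the Methods (Kaplan-Meier estimation with a two-group log-rank test) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names `kaplan_meier` and `logrank_test` are hypothetical, and the follow-up times and event flags below are synthetic values chosen only to make the example run, not study data.

```python
import math

def kaplan_meier(times, events):
    """Kaplan-Meier estimate: list of (time, survival) at each event time.

    times  -- follow-up in days; events -- 1 if grade 2+ toxicity, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = 0        # toxicity events at time t
        leaving = 0  # events plus censorings at time t
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            leaving += 1
            i += 1
        if d > 0:
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance if variance > 0 else 0.0
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Synthetic example (NOT study data): days to grade 2+ GU toxicity or censoring.
ml_days, ml_events = [200, 250, 310, 400, 520, 600], [0, 1, 0, 0, 1, 0]
human_days, human_events = [190, 240, 300, 450], [1, 0, 0, 1]

print(kaplan_meier(ml_days, ml_events))
print(logrank_test(ml_days, ml_events, human_days, human_events))
```

In practice an established survival analysis package would normally be preferred over a hand-written estimator; the sketch only makes the two steps of the comparison explicit.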