A. Rosich, J. C. Ferrer, S. M. Guerreros, E. Rivero, M. Torres, S. Roldan Sr, M. Giordano, G. Paolini, K. Ochandorena, L. Ricagni, and F. Lorenzo; Unidad Academica de Radioterapia, Universidad de la Republica, Montevideo, Uruguay
Purpose/Objective(s): This study aims to train a language model to suggest treatments for early-stage breast cancer patients with a level of accuracy comparable to that of experienced radiotherapists. The goal is to demonstrate the potential of these tools as complementary aids in clinical decision-making. Key treatment aspects such as radiotherapy fractionation, surgical bed boost, and lymph node inclusion are assessed for decision coherence.
Materials/Methods: Ethical issues: The current study was conducted in compliance with international ethical standards applied to biomedical research (i.e., the MERCOSUR Standards on the Regulation of Clinical Trials and the Declaration of Helsinki of the World Medical Association [including its October 2013 amendment]). The study was registered with the Ministry of Public Health (MSP) of Uruguay under number 8987629. A large language model was trained using the Generative Pre-trained Transformer 4 (GPT-4) Application Programming Interface (API), fueled by a collection of research articles on early-stage breast cancer. A total of 43 relevant articles were selected and included for model training. Subsequently, 15 senior radiotherapists were convened for the study. Each radiotherapist completed an online form in which they made treatment decisions for 10 fictional clinical cases. For each response received from the group of specialists, the AI was asked to generate an additional response for the same set of patients. The clinical cases covered age, histological variables, different TNM stages, presence of estrogen, progesterone, and HER2 receptors, Ki-67, final histological grade, presence of lymphovascular invasion, type of surgery, margin status after surgery, and the need for margin extension. The number of nodes removed, node positivity, and whether chemotherapy was administered, followed by restaging, were also specified. To measure the agreement between the model's decisions and those of the specialists, the chi-square test and Cohen's kappa index were used. Results: The results show agreement between AI and humans in the decision to pursue standard radiotherapy hypofractionation, with AI voting Yes 100% of the time compared to humans at 98.55% (p = 0.156; 95% CI, Kappa = 0.663). In the case of inclusion of a surgical bed boost, AI and humans voted at 69.57% and 70.29%, respectively (p = 0.896; 95% CI, Kappa = 0.983).
For lymph node inclusion in the treatment, AI and humans voted at 27.54% and 34.06%, respectively (p = 0.241; 95% CI, Kappa = 0.848). Conclusion: This study demonstrates the potential of AI as an effective tool to support therapeutic decision-making in radiotherapy, suggesting its applicability to enhance efficiency and precision in treatment planning.
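Illustrative sketch: The agreement measures named in the Methods (chi-square test and Cohen's kappa) could be computed along the following lines. This is a minimal sketch and not the authors' analysis code. It assumes that each decision is recorded as a paired yes/no label (one from a specialist, one from the AI, for the same case), which the abstract does not specify. The function names `cohen_kappa` and `chi_square_2x2` and the example labels are hypothetical, and the labels are toy values rather than study data.

```python
import math
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters need the same non-zero number of labels")
    n = len(rater_a)
    # Observed proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    `table` is ((a, b), (c, d)), e.g. rows = AI / humans, columns = yes / no.
    Returns (statistic, p_value) with one degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    if 0 in rows or 0 in cols:
        raise ValueError("chi-square is undefined when a row or column total is zero")
    statistic = n * (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])
    # For one degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Toy labels only (hypothetical, NOT data from the study).
human = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes", "no", "yes"]
ai = ["yes", "yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes"]

kappa = cohen_kappa(human, ai)
stat, p = chi_square_2x2(((10, 0), (5, 5)))
print(f"kappa = {kappa:.3f}, chi-square = {stat:.3f}, p = {p:.4f}")
```

In this sketch kappa is computed on paired labels, whereas the chi-square test compares the yes/no proportions of the two groups; the two statistics answer different questions, so both are usually reported together.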