Memorial Sloan Kettering Cancer Center, New York, NY
C. Choi1, J. Jiang2, M. Thor2, A. Rimner3, J. O. Deasy2, J. S. Kim4, and H. Veeraraghavan2; 1Memorial Sloan Kettering Cancer Center, New York, NY, 2Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY, 3Department of Radiation Oncology, Medical Center – University of Freiburg, Faculty of Medicine, University of Freiburg, German Cancer Consortium (DKTK), partner site DKTK-Freiburg, Freiburg, Germany, 4Medical Physics and Biomedical Engineering Laboratory (MPBEL), Yonsei University College of Medicine, Seoul, Korea, Republic of (South)
Purpose/Objective(s): Severe acute esophagitis (AE) after radiation therapy (RT) in patients with locally advanced non-small cell lung cancer (LA-NSCLC) has previously been linked to decreased overall survival. Identifying, before treatment, the patients at higher risk of developing esophagitis could assist RT planning and improve patient management, reducing the risk of toxicity following RT. The aim of this study was to build an AE prediction model using a large vision foundation model that integrates pretreatment computed tomography (CT) imaging and planned dose.
Materials/Methods: This study included a dataset of 240 patients with LA-NSCLC treated with intensity-modulated RT to 60 Gy in 2 Gy fractions, with ≥ grade 2 acute esophagitis as the endpoint (AE2; rate: 45%). A vision foundation transformer model, pre-trained with self-supervision on unlabeled public and institutional datasets (n=10,412 CTs) from multiple diseases, was augmented with a fully convolutional single-layer decoder to classify AE. The model was fine-tuned on CT images, planned dose maps, and planning contours of the esophagus to predict AE2 during RT. The model's predictions were compared against a previously established normal tissue complication probability (NTCP) model, which is a logistic function of: -3.1 + (Concurrent_chemo[1/0] × 1.50) + (Mean_esophageal_dose[Gy] × 0.069) (PMCID:PMC3783997). Both modeling approaches were evaluated with three-fold stratified cross-validation using identical splits, with the area under the receiver operating characteristic curve (AUROC) as the performance metric.
Results: The foundation model predicted AE more accurately, with an AUROC of 0.71 ± 0.01, than the published NTCP model, with an AUROC of 0.61 ± 0.11. The DeLong test comparing the two models gave a p-value of 0.04, indicating a statistically significant difference between them.
Conclusion: Our study shows that a pretrained vision foundation model, fine-tuned on a modest number of cases combining pretreatment CT and radiation dose, predicted AE more accurately than a published NTCP model.
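For illustration, the comparison NTCP model quoted in the Methods can be evaluated for a single patient as in the minimal Python sketch below. The coefficients are those stated in the abstract; the function name `ntcp_ae2`, the input validation, and the example patient are hypothetical, and reading "logistic function" as the standard sigmoid 1 / (1 + exp(-x)) is an assumption rather than a detail confirmed by the abstract.

```python
import math

# Coefficients as quoted in the abstract for the published NTCP model.
INTERCEPT = -3.1
COEF_CONCURRENT_CHEMO = 1.50   # applied when concurrent chemotherapy is given (1/0)
COEF_MEAN_DOSE_PER_GY = 0.069  # per Gy of mean esophageal dose


def ntcp_ae2(concurrent_chemo: bool, mean_esophageal_dose_gy: float) -> float:
    """Return the modeled probability of grade >= 2 acute esophagitis.

    Assumes the standard logistic link; the function name is hypothetical.
    """
    if mean_esophageal_dose_gy < 0:
        raise ValueError("mean esophageal dose must be non-negative")
    linear_predictor = (
        INTERCEPT
        + COEF_CONCURRENT_CHEMO * (1.0 if concurrent_chemo else 0.0)
        + COEF_MEAN_DOSE_PER_GY * mean_esophageal_dose_gy
    )
    # Standard logistic (sigmoid) transform of the linear predictor.
    return 1.0 / (1.0 + math.exp(-linear_predictor))


if __name__ == "__main__":
    # Hypothetical patient: concurrent chemotherapy, mean esophageal dose 25 Gy.
    print(f"{ntcp_ae2(True, 25.0):.3f}")
```

For this hypothetical patient the linear predictor is -3.1 + 1.50 + 25 × 0.069 = 0.125, which corresponds to a predicted AE2 probability of roughly 0.53 under the assumed logistic link.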