J. Zhao1, E. J. Vaios1, Z. Yang2, S. H. Robertson1, J. Ginn1, K. Lu1, F. F. Yin2, Z. J. Reitman1, J. P. Kirkpatrick1, S. R. Floyd3, P. Fecci1, and C. Wang1; 1Duke University, Durham, NC, 2Duke Kunshan University, Kunshan, China, 3Department of Radiation Oncology, Duke University Medical Center, Durham, NC
Purpose/Objective(s): Stereotactic radiosurgery (SRS) is a widely used treatment for brain metastases (BM), but the risk of radionecrosis complicates post-SRS management. Given the lack of non-invasive imaging methods for distinguishing radionecrosis from recurrence, our goal was to design a deep ensemble learning model that integrates patient clinical features and genomic profiles to identify radionecrosis in BM patients showing post-SRS radiographic progression.
Materials/Methods: We studied 90 BMs from 62 non-small cell lung cancer (NSCLC) patients, including 27 biopsy-confirmed post-SRS local recurrences. Clinical features (patient age, BM location, SRS prescription, KPS, chemo/targeted therapy, immunotherapy, and steroid use) and genomic features (seven NSCLC driver mutations) were collected. We first analyzed the 3-month post-SRS high-resolution T1+c volume: a 3D volume-of-interest (VOI) centered on each BM was determined based on the SRS V60% isodose volume. A deep neural network (DNN) resembling the encoding path of U-Net was trained for radionecrosis/recurrence prediction using the VOI as input. Latent variables preceding the binary prediction output were extracted from the DNN as 1024 deep features. An ensemble learning model was then developed, comprising two sub-models that fused deep features with clinical (D+C) or genomic (D+G) features. We employed our previous positional encoding (PE) method to fuse the low-dimensional clinical/genomic features with the high-dimensional image features and thereby mitigate the ‘curse of dimensionality’. In each sub-model, the post-fusion feature was passed through fully connected layers to yield a logit for the radionecrosis/recurrence prediction. The ensemble's final output combined the two sub-models’ logits via logistic regression. Model training employed an 8:2 train/test split, and 10 model versions were developed for robustness evaluation. Performance metrics were compared against the image-only DNN model and the D+C and D+G sub-models.
Results: On the test set, the deep ensemble model achieved ROC AUC = 0.91±0.04, sensitivity = 0.87±0.16, specificity = 0.86±0.08, and accuracy = 0.87±0.04. This outperformed the image-only DNN result (ROC AUC: 0.71±0.05, sensitivity: 0.66±0.32), the D+C result (ROC AUC: 0.82±0.03, sensitivity: 0.67±0.17), and the D+G result (ROC AUC: 0.83±0.02, sensitivity: 0.76±0.22).
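The final ensemble step described in the Methods, in which the logits of the D+C and D+G sub-models are combined by logistic regression and evaluated on a held-out 20% of the data, can be illustrated with a minimal sketch. This is not the authors' implementation: the logits below are synthetic random numbers, the class coding (27 positives, 63 negatives) only mirrors the cohort size, the stratified split is an assumption, and the function names `fit_stacker`, `predict`, `binary_metrics`, and `run_demo` are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_stacker(logit_pairs, labels, lr=0.1, epochs=2000):
    """Fit p = sigmoid(w_dc*z_dc + w_dg*z_dg + b) by batch gradient descent."""
    w_dc, w_dg, b = 0.0, 0.0, 0.0
    n = len(labels)
    for _ in range(epochs):
        g_dc = g_dg = g_b = 0.0
        for (z_dc, z_dg), y in zip(logit_pairs, labels):
            err = sigmoid(w_dc * z_dc + w_dg * z_dg + b) - y
            g_dc += err * z_dc
            g_dg += err * z_dg
            g_b += err
        w_dc -= lr * g_dc / n
        w_dg -= lr * g_dg / n
        b -= lr * g_b / n
    return w_dc, w_dg, b


def predict(params, logit_pairs, threshold=0.5):
    """Binary ensemble decision from the two sub-model logits."""
    w_dc, w_dg, b = params
    return [int(sigmoid(w_dc * z_dc + w_dg * z_dg + b) >= threshold)
            for z_dc, z_dg in logit_pairs]


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(y_true),
    }


def run_demo(seed=0):
    rng = random.Random(seed)
    # Synthetic stand-in data (assumption): 27 positives, 63 negatives.
    labels = [1] * 27 + [0] * 63
    # Hypothetical sub-model logits: noisy scores centred at +1 or -1 by class.
    pairs = [(rng.gauss(2 * y - 1, 1.0), rng.gauss(2 * y - 1, 1.0)) for y in labels]
    # 8:2 split done per class (stratification is an assumption of this sketch).
    train_idx, test_idx = [], []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = int(0.8 * len(idx))
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    params = fit_stacker([pairs[i] for i in train_idx], [labels[i] for i in train_idx])
    y_test = [labels[i] for i in test_idx]
    y_pred = predict(params, [pairs[i] for i in test_idx])
    return binary_metrics(y_test, y_pred)


print(run_demo(seed=0))
```

Repeating `run_demo` with different seeds and summarizing each metric as mean ± standard deviation would be one way to obtain figures in the same form as those reported above; the numbers produced by this synthetic sketch carry no clinical meaning.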
Conclusion: The radiogenomic deep ensemble model outperformed the image-only DNN and both the D+C and D+G sub-models in distinguishing BM radionecrosis from recurrence using 3-month post-SRS T1+c MR images, clinical features, and genomic features. This highlights the potential of artificial intelligence in clinical decision-making for BM management and warrants further investigation into its clinical applications.