N. S. Moore1, J. H. Laird Jr1, N. Verma1, T. Hager1,2, D. Sritharan1,2, V. Lee1, R. Maresca1,2, S. Chadha1, H. S. M. Park1, and S. Aneja1,2; 1Department of Therapeutic Radiology, Yale School of Medicine, New Haven, CT, 2Center for Outcomes Research and Evaluation (CORE), New Haven, CT
Purpose/Objective(s): Patients with oligometastatic disease (OMD) often benefit from ablative metastasis-directed therapy, yet identifying these patients remains a challenge. OMD status is not currently coded in tumor registries or common data models, limiting efforts to identify patients for clinical care and research. We hypothesized that language models applied to clinical text readily available from tumor registries could classify patients with non-small cell lung cancer (NSCLC) as oligometastatic or not.
Materials/Methods: Clinical data, including radiology impressions from the tumor registry, were obtained for 2884 patients with stage IV NSCLC treated at a single institution from 2010-2019. Records were classified by burden of disease within 6 months of diagnosis as locoregional, oligometastatic (1-5 metastases), polymetastatic (>5 metastases), or NOS/unknown. Radiology impressions were used to train and evaluate text classifiers from three model architectures: (1) convolutional neural networks (CNN) using scispaCy word vectors, (2) BERT-based transformers, and (3) a large language model, MEDITRON-7B. Classifiers were evaluated on the test cohort by comparing ROC AUC, F1 score, precision, sensitivity/recall, computational requirements, and speed. A subgroup analysis of the test cohort included only records that could be correctly classified by blinded clinician review of the radiology impressions alone. Shapley values were used to assess model interpretability.
Results: In total, 684 patients were classified as locoregional (n=251, 37%), oligometastatic (n=175, 26%), polymetastatic (n=190, 28%), or NOS/unknown (n=64, 10%). Binary models were class-balanced, with equal proportions of oligometastatic vs other records (n=175 each). Across architectures, binary models were more accurate than multicategory models. Among the BERT-based models tested, ClinicalBERT demonstrated the best performance. MEDITRON performance improved as the number of in-context examples increased to 5. Evaluated on the test cohort, the models demonstrated AUCs of 0.45 (CNN), 0.58 (ClinicalBERT), and 0.65 (MEDITRON). For the subgroup of impressions correctly classified by blinded clinician review, AUCs were 0.39 (CNN), 0.71 (ClinicalBERT), and 0.61 (MEDITRON). Review of Shapley values for the test set identified terms with clinical relevance. Terms with the greatest mean contribution towards oligometastatic classification included “nodule”, “lobe”, “mass”, and “enhancement”, often in the context of brain MRI.
Those with the greatest contribution towards non-oligometastatic classification included “heterogeneous”, “adenopathy”, and “lymphatic”.
Conclusion: Language models applied to radiology impressions identify patients with oligometastatic NSCLC with acceptable accuracy. Our work suggests a role for language models in screening patients with OMD for further review based on limited, widely available clinical text, advancing care for a patient cohort that may not be routinely recognized.
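The Shapley-value interpretability step can be illustrated with a small, exact computation. The sketch below is not the study's pipeline: the scoring function, token weights, and interaction term are illustrative assumptions standing in for a trained classifier's "oligometastatic" logit, and Shapley values are computed exactly by enumerating token coalitions (feasible only for a handful of tokens; libraries such as SHAP approximate this at scale).

```python
from itertools import combinations
from math import factorial

# Illustrative weights standing in for a trained classifier (assumption,
# not learned from data); positive values push towards "oligometastatic".
WEIGHTS = {"nodule": 0.8, "mass": 0.5, "adenopathy": -0.9, "heterogeneous": -0.6}

def score(tokens):
    """Toy stand-in for a classifier logit over a set of impression tokens."""
    s = sum(WEIGHTS.get(t, 0.0) for t in tokens)
    # An interaction term makes the function non-additive, so Shapley
    # values differ from the raw weights.
    if "nodule" in tokens and "mass" in tokens:
        s += 0.3
    return s

def shapley(tokens):
    """Exact Shapley value of each token for score(), enumerating coalitions."""
    tokens = list(tokens)
    n = len(tokens)
    values = {}
    for t in tokens:
        rest = [u for u in tokens if u != t]
        phi = 0.0
        for k in range(n):
            for coalition in combinations(rest, k):
                # Standard Shapley weight for a coalition of size k.
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi += w * (score(set(coalition) | {t}) - score(set(coalition)))
        values[t] = phi
    return values

phi = shapley(["nodule", "mass", "adenopathy"])
# The values sum to score(all tokens) - score(empty set), and the
# "nodule"/"mass" interaction credit is split equally between them.
```

Here "nodule" and "mass" receive positive attributions and "adenopathy" a negative one, mirroring how per-term mean contributions were aggregated across the test set in the analysis above.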