University of California San Francisco, San Francisco, CA
S. C. D. Hampson1, I. Friesner1, J. Ejie1, J. J. Chen2, and J. C. Hong1,2; 1University of California, San Francisco, Bakar Computational Health Sciences Institute, San Francisco, CA, 2Department of Radiation Oncology, University of California San Francisco, San Francisco, CA
Purpose/Objective(s): Unplanned hospital admissions and emergency department visits are a frequent adverse outcome during radiation therapy (RT). The System for High-Intensity Evaluation During Radiation Therapy (SHIELD-RT) study demonstrated that electronic health record (EHR)-based models can predict acute care events and facilitate intervention to reduce them. Clinical note text may complement or replace structured data in these predictions. The objective of this study was to compare machine learning approaches to predict, on any day of RT, acute care events within 7 days using combinations of baseline structured and dynamic EHR data.
Materials/Methods: This retrospective cohort study included 14,041 external beam RT courses (250,446 treatment days) from patients with cancer at a single institution from 2013 to 2023. The unit of observation was the treatment day. Baseline EHR predictors included demographics, treatment plan, recent admissions, medical history, medications, laboratory studies, and vital signs prior to treatment. Dynamic predictors from the 14 days preceding the prediction included abnormal vitals or labs and natural language processing (NLP)-derived features from SNOMED concepts. NLP features were extracted from free-text clinical notes using the Apache clinical Text Analysis and Knowledge Extraction System (cTAKES), a clinical NLP engine. The data were divided into training and hold-out testing sets using a 70/30 time-based split. LASSO logistic regression, random forest (RF), and gradient boosted tree (GTB) models were trained on 1) baseline structured features, 2) dynamic features, and 3) all features combined to predict the occurrence of an acute care event in the next 7 days. Model performance was evaluated by area under the receiver operating characteristic curve (AUC) for discrimination and by Brier score for calibration.
Results: The per-course event rate was 7.0%. In the test cohort, RF had the highest AUC: 0.79 when trained on baseline predictors and 0.78 when trained on all predictors. All GTB models and the RF trained on dynamic predictors had an AUC of 0.72. The LASSO model had AUC values of 0.74 (dynamic), 0.70 (baseline), and 0.67 (all predictors). All models had their lowest Brier scores when trained on dynamic predictors (0.016 for GTB and LASSO, 0.017 for RF). When trained on either baseline or all predictors, GTB and RF had Brier scores of 0.020. LASSO had Brier scores of 0.024 for baseline and 0.027 for all predictors.
Conclusion: In patients receiving RT, models trained on dynamic features alone predicted acute care events in the next 7 days with performance comparable to that of models trained on baseline features.
This demonstrates that a text-based model can be implemented in situations where structured baseline data are not available and can be designed for use on any day of treatment. This approach may help identify and prevent acute care events, which could improve patient outcomes and quality of life during RT.
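The evaluation steps named in the Materials/Methods (a 70/30 time-based split, AUC for discrimination, Brier score for calibration) can be illustrated with a minimal sketch. This is not the study's code: the function names, the `date` key on each row, and the choice to cut by row count after sorting by date are all assumptions made for illustration, and the abstract does not state how the split boundary was chosen.

```python
from datetime import date


def time_based_split(rows, train_fraction=0.7):
    """Sort observations chronologically; the earliest fraction becomes training data.

    Assumption: each row is a dict with a 'date' key (hypothetical schema).
    """
    ordered = sorted(rows, key=lambda r: r["date"])
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random event is scored above a random non-event.

    Ties count as one half. This pairwise form is quadratic in sample size,
    which is fine for a sketch but not for hundreds of thousands of rows.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy treatment-day rows (made-up values, not study data).
toy_rows = [{"date": date(2013 + i, 1, 1), "event": i % 2} for i in range(10)]
train_rows, test_rows = time_based_split(toy_rows)

# Toy labels and predicted probabilities (made-up values).
toy_labels = [0, 0, 1, 1]
toy_probs = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_probs)
toy_brier = brier_score(toy_labels, toy_probs)
```

Splitting by time rather than at random keeps every test observation later than every training observation, which more closely mimics deploying a model on future patients. With repeated daily observations per course, a real split would also need to decide whether a course that straddles the boundary can contribute days to both sets; the abstract does not specify this.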