E. Shin¹, J. Choi², T. Hung³, C. Poon³, Y. Yu⁴, and J. J. Kang⁵; ¹Geneva School, New York, NY, ²SUNY Downstate College of Medicine, Brooklyn, NY, ³Memorial Sloan Kettering Cancer Center, New York, NY, ⁴Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, ⁵Yale University Department of Radiation Oncology, New Haven, CT
Purpose/Objective(s): The gold standard test for identifying human papillomavirus-related oropharyngeal cancer (HPV+ OPC) is mRNA in situ hybridization (ISH). p16INK4a (p16) immunohistochemistry (IHC) is a surrogate for HPV+ OPC and is often the only diagnostic test performed, given its high sensitivity and wide availability. Because routine management does not require both p16 and HPV testing, the prognosis of discordant tumors is uncertain. We aimed to assess the feasibility of using natural language processing (NLP) to extract p16 and HPV biomarker status. We hypothesized that p16 and HPV discordant tumors have outcomes that differ from those predicted by p16 status alone.
Materials/Methods: We included a series of OPC patients from 7/1994 to 6/2016 with digitized pathology records from our multi-institutional database. NLP was used to code biomarker status from free-text pathology reports. All reports were manually evaluated by two independent reviewers to derive a gold standard classification. Patients were excluded if they were not treated with curative radiation, if p16 IHC and HPV ISH were not both performed, or if either the p16 or the HPV result was equivocal. In total, 583 patients were included, with a median follow-up of 66 months (range, 1-280). The Kaplan-Meier method was used to estimate progression-free survival (PFS) and overall survival (OS).
Results: p16 and HPV results were automatically extracted from a majority (60%) of reports, among which NLP correctly classified 99% of p16 and 96% of HPV results. Positive predictive value (precision), sensitivity (recall), and F-score for p16/HPV were 98/84%, 97/91%, and 97/86%, respectively. Four groups were compared: p16-negative/HPV-negative (p16-HPV-, n=50), p16-negative/HPV-positive (p16-HPV+, n=12), p16-positive/HPV-negative (p16+HPV-, n=84), and p16-positive/HPV-positive (p16+HPV+, n=437). Overall, 16.5% of tumors were discordant (p16+HPV- or p16-HPV+), and 16% of p16+ patients were HPV-. The 2-/5-year OS rates (p<0.001) were 93/89% for p16+HPV+, 94/74% for p16+HPV-, 100/100% for p16-HPV+, and 82/66% for p16-HPV-. The 2-/5-year PFS rates (p=0.013) were 90/86% for p16+HPV+, 81/76% for p16+HPV-, 92/92% for p16-HPV+, and 83/75% for p16-HPV-. p16+HPV- clustered unfavorably with p16-HPV-, while p16-HPV+ clustered favorably with p16+HPV+. Number needed to harm calculations estimate that for every 10 p16+HPV- patients treated, one additional adverse outcome would be observed beyond what would be anticipated for their p16+HPV+ counterparts.
Conclusion: We provide the first study using NLP to classify data from head and neck cancer pathology reports and, to our knowledge, the largest study of p16 and HPV discordant OPC outcomes in the United States.
Pathology report standardization would improve NLP performance, and future work will compare these methods to large language models. p16 and HPV discordant tumors constitute a notable minority (16.5%) of patients, and the inferior prognosis of p16+HPV- patients suggests that both tests may be of clinical utility, especially when considering patients for treatment de-escalation trials.
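
The summary statistics reported in the Results can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' analysis code: the function names are hypothetical, the F-score is assumed to be the standard harmonic mean of precision and recall, and the number needed to harm is assumed to be the reciprocal of the absolute difference in event-free rates. The abstract does not state which endpoint or time point the number needed to harm was based on; 5-year PFS (86% vs. 76%) is used here purely as an assumption because it is consistent with the reported figure of about 10.

```python
# Group sizes as reported in the Results (p16 status, HPV status) -> n.
COUNTS = {
    ("p16-", "HPV-"): 50,
    ("p16-", "HPV+"): 12,
    ("p16+", "HPV-"): 84,
    ("p16+", "HPV+"): 437,
}


def discordance_rate(counts):
    """Fraction of all tumors whose p16 and HPV results disagree."""
    total = sum(counts.values())
    discordant = counts[("p16-", "HPV+")] + counts[("p16+", "HPV-")]
    return discordant / total


def hpv_negative_among_p16_positive(counts):
    """Fraction of p16+ tumors that are HPV-."""
    p16_positive = counts[("p16+", "HPV-")] + counts[("p16+", "HPV+")]
    return counts[("p16+", "HPV-")] / p16_positive


def f_score(precision, recall):
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    return 2 * precision * recall / (precision + recall)


def number_needed_to_harm(event_free_reference, event_free_exposed):
    """Reciprocal of the absolute increase in event risk.

    Both arguments are event-free proportions (for example, PFS rates).
    Assumption: the abstract does not specify the endpoint it used.
    """
    risk_increase = event_free_reference - event_free_exposed
    if risk_increase <= 0:
        raise ValueError("exposed group must have the lower event-free rate")
    return 1 / risk_increase


if __name__ == "__main__":
    print(f"Discordant tumors: {discordance_rate(COUNTS):.1%}")
    print(f"HPV- among p16+: {hpv_negative_among_p16_positive(COUNTS):.0%}")
    print(f"p16 F-score: {f_score(0.98, 0.97):.0%}")
    # Assumed endpoint: 5-year PFS, p16+HPV+ (86%) vs. p16+HPV- (76%).
    print(f"NNH: {number_needed_to_harm(0.86, 0.76):.0f}")
```

Because the inputs are the rounded percentages printed in the abstract, recomputed values can differ from the reported ones by about a percentage point.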