PQA 09 - PQA 09 Hematologic Malignancies and Digital Health Innovations Poster Q&A
3406 - Deep Learning with Biopsy Whole Slide Images for Predicting Pathological Complete Response to Neoadjuvant Immunochemotherapy in Esophageal Squamous Cell Carcinoma
X. Liu1, Y. Yang2, Y. Yi2, Z. R. Li1, Q. Zhou1, B. Li2, and C. Zhao3; 1Manteia Technologies Co., Ltd., Xiamen, Fujian, China, 2Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong, China, 3Manteia Technologies, Xiamen, Fujian, China
Purpose/Objective(s): Neoadjuvant immunochemotherapy (nICT) has been reported to improve the prognosis of locally advanced esophageal squamous cell carcinoma (ESCC). However, its effectiveness varies among patients, and identifying those most likely to benefit remains a challenge. Accurate prediction of pathological complete response (pCR) in patients undergoing neoadjuvant therapy is therefore important for personalizing treatment. Whole slide images (WSIs) offer histopathological insights that can correlate with patients' prognosis and response to treatment. However, the acquisition, processing, and modeling of WSIs present numerous technical challenges, including their megapixel size and the lack of pixel-level annotations, which hinder the development of WSI-based biomarkers for predicting treatment response. In this study, we propose a deep learning (DL) framework designed to predict pCR from pretreatment WSIs and evaluate its performance in ESCC patients undergoing nICT.
Materials/Methods: The proposed DL framework tackles several key challenges in modeling WSIs. First, we apply H&E normalization to mitigate inconsistencies in histology slide preparation. Next, Otsu segmentation is used to automatically delineate foreground pixels and eliminate background noise, which substantially reduces the input image size and in turn mitigates overfitting during modeling. The foreground region is then partitioned into small 256x256 patches, and features are extracted from each patch using a neural network pretrained on a large number of WSIs, which further addresses the challenge of handling very large images. To further limit the overfitting that results from the abundance of patches per WSI, k-means clustering is performed on the patch features, and a fraction (approximately 10%) of representative patches is retained for each cluster. The resulting subset of patch features for each WSI is then fed into a novel Transformer-based neural network that processes both morphological and spatial information among patches and generates the final pCR prediction.
Results: This study enrolled 147 patients with locally advanced ESCC who underwent esophagectomy after nICT. Patients were randomly partitioned into training, validation, and testing cohorts at a ratio of 7:1.5:1.5. The proposed DL framework using pretreatment WSIs demonstrated promising predictive performance, achieving an area under the curve (AUC) of 0.84 in the validation cohort and 0.80 in the testing cohort.
Conclusion: To our knowledge, this study is the first to establish WSIs as predictive biomarkers for the response to nICT in ESCC. The proposed DL framework addresses several critical challenges encountered in modeling WSIs and demonstrates promising predictive performance, which could inform the planning of ESCC treatment strategies.
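The preprocessing steps described under Materials/Methods (Otsu foreground segmentation, tiling into patches, and k-means-based patch selection) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. All function names and parameters (`min_tissue`, `iters`, `seed`) are hypothetical, the toy data stand in for real slides and learned features, and two details are assumptions the abstract does not specify: tissue is taken to be darker than the Otsu threshold, and "representative" patches are taken to be those closest to each cluster centroid.

```python
import random

def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance
    for the two classes {p <= t} and {p > t}."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0  # pixel count and intensity sum of the "<= t" class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def foreground_patches(gray, patch, t, min_tissue=0.5):
    """Tile a 2D grayscale image into patch x patch squares and keep the
    top-left corners of tiles whose tissue fraction is >= min_tissue.
    Assumption: tissue pixels are darker than the background (value <= t)."""
    h, w = len(gray), len(gray[0])
    coords = []
    for r in range(0, h - patch + 1, patch):
        for c in range(0, w - patch + 1, patch):
            tissue = sum(1 for y in range(r, r + patch)
                         for x in range(c, c + patch) if gray[y][x] <= t)
            if tissue >= min_tissue * patch * patch:
                coords.append((r, c))
    return coords

def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(features, centroids):
    """Label each feature vector with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sqdist(f, centroids[j]))
            for f in features]

def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(features, k)]
    for _ in range(iters):
        labels = assign(features, centroids)
        for j in range(k):
            members = [f for f, l in zip(features, labels) if l == j]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign(features, centroids), centroids

def select_representatives(features, labels, centroids, fraction=0.1):
    """Keep roughly `fraction` of each cluster (at least one patch).
    Assumption: the patches nearest the centroid are the representatives."""
    keep = []
    for j, c in enumerate(centroids):
        idx = [i for i, l in enumerate(labels) if l == j]
        if not idx:
            continue
        idx.sort(key=lambda i: sqdist(features[i], c))
        keep.extend(idx[:max(1, round(fraction * len(idx)))])
    return sorted(keep)

# Toy usage: a 4x4 "slide" with one dark 2x2 tissue region, tiled at 2x2.
toy_gray = [[10, 10, 200, 200],
            [10, 10, 200, 200],
            [200, 200, 200, 200],
            [200, 200, 200, 200]]
toy_t = otsu_threshold([p for row in toy_gray for p in row])
toy_coords = foreground_patches(toy_gray, patch=2, t=toy_t)
```

In a real pipeline the tile size would be 256 and the feature vectors would come from the pretrained network mentioned in the abstract; here any list of equal-length numeric tuples can be passed to `kmeans` and `select_representatives`.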