H. Eum1,2, J. Hwang3,4, J. W. Park2, S. Park2, J. Seong2, and J. S. Kim1,2; 1Medical Physics and Biomedical Engineering Laboratory (MPBEL), Yonsei University College of Medicine, Seoul, Korea, Republic of (South), 2Department of Radiation Oncology, Yonsei Cancer Center, Yonsei University College of Medicine, Seoul, Korea, Republic of (South), 3Department of Nuclear and Quantum Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, Republic of (South), 4Medical Image and Radiotherapy Laboratory (MIRLAB), Korea Advanced Institute of Science and Technology, Daejeon, Korea, Republic of (South)
Purpose/Objective(s): The adoption of artificial intelligence (AI) for auto-segmentation has shown promise in achieving high-quality organ-at-risk (OAR) contours. However, producing accurate tumor contours remains challenging because of the inherent heterogeneity of tumors. This study investigates how liver tumor heterogeneity in abdominal computed tomography (CT) scans influences the performance of advanced deep-learning networks.
Materials/Methods: We analyzed two datasets, each comprising 131 patients with liver tumors: the Liver Tumor Segmentation Challenge 2017 (LiTS) dataset and a dataset from a South Korean hospital. The two datasets differ in tumor diversity, tumor dispersion, and tumor-to-liver ratio. To evaluate the impact of tumor heterogeneity on the performance of deep learning networks in abdominal CT liver tumor segmentation, we selected models in descending order of inductive bias: nnU-Net, Attention U-Net, Swin UNETR, and UNETR. Performance metrics included the Dice Similarity Coefficient (DSC), Intersection over Union (IoU), and Average Surface Distance (ASD).
Results: The LiTS dataset contains diverse and scattered tumors with a tumor-to-liver ratio of 0.046 (± 0.085), indicating less concentrated tumor regions. Conversely, the hospital dataset consisted of hepatocellular carcinoma (HCC) patients after primary radiation therapy, with more clustered tumors and a significantly higher tumor-to-liver ratio of 0.459 (± 0.478), a difference validated by a statistical p-value of . In the LiTS dataset, Attention U-Net exhibited the highest DSC at 0.792, closely followed by nnU-Net with a DSC of 0.784. This result illustrates the strength of attention-based models in tasks requiring the detection of diverse and scattered tumors, owing to their capability to learn global semantics effectively. In the hospital dataset, by contrast, nnU-Net was the leading model with a DSC of 0.816, while Swin UNETR recorded a DSC of 0.757. This outcome demonstrates the efficacy of convolution-based models in tasks where learning the details of tumor boundaries is crucial, as they excel in capturing local semantics.
Conclusion: Our research emphasizes the significant impact of tumor heterogeneity on the performance of deep learning models for liver tumor segmentation in abdominal CT images.
It reveals a direct relationship between model architecture and tumor characteristics: attention-based models excel at segmenting dispersed tumors by effectively minimizing false positives, whereas convolution-based models identify true positives more accurately in dense tumor scenarios such as HCC. This distinction highlights the importance of selecting AI models based on specific tumor characteristics and challenges the efficacy of a one-size-fits-all approach to tumor segmentation.
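As an illustration of the overlap metrics and the tumor-to-liver ratio reported above, the following is a minimal Python sketch, not the authors' implementation. It assumes flattened voxel masks, and it assumes a label convention of 1 for healthy liver and 2 for tumor, with the ratio defined as tumor voxels divided by all liver voxels (tumor included); the abstract does not state its exact definition. The function names are hypothetical, and ASD is omitted because it requires surface extraction.

```python
from typing import Sequence, Tuple


def dice_and_iou(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float]:
    """Return (DSC, IoU) for two flattened binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        # Both masks are empty. Scoring this as perfect agreement is an
        # assumed convention, not something stated in the abstract.
        return 1.0, 1.0
    dsc = 2.0 * intersection / (pred_size + truth_size)
    iou = intersection / union
    return dsc, iou


def tumor_to_liver_ratio(labels: Sequence[int],
                         liver_label: int = 1,
                         tumor_label: int = 2) -> float:
    """Tumor voxels divided by whole-liver voxels (assumed definition)."""
    tumor = sum(1 for v in labels if v == tumor_label)
    liver = sum(1 for v in labels if v == liver_label)
    whole_liver = tumor + liver  # assumption: tumor counts as part of the liver
    if whole_liver == 0:
        raise ValueError("label map contains no liver or tumor voxels")
    return tumor / whole_liver


if __name__ == "__main__":
    prediction = [0, 1, 1, 1, 0, 0]
    reference = [0, 0, 1, 1, 1, 0]
    dsc, iou = dice_and_iou(prediction, reference)
    print(f"DSC={dsc:.3f} IoU={iou:.3f}")

    label_map = [0, 1, 1, 1, 2, 0, 1, 2]
    print(f"tumor-to-liver ratio={tumor_to_liver_ratio(label_map):.3f}")
```

In this toy case the masks share two of four foreground voxels in their union, so the DSC is 2/3 and the IoU is 0.5. A real pipeline would apply the same counts to flattened 3D label volumes for each patient and then summarize them as a mean and standard deviation.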