Memorial Sloan Kettering Cancer Center Uniondale, NY
Y. Fu1, H. Zhang2, W. Cai1, L. Kuo3, H. Xie1, J. J. Cuaron1, L. I. Cervino3, J. M. Moran3, X. Li1, and T. Li4; 1Memorial Sloan Kettering Cancer Center, New York, NY, 2Memorial Sloan Kettering Cancer Center, New York, NY, 3Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY, 4Memorial Sloan Kettering Cancer Center, New York, NY
Purpose/Objective(s): For breath-hold cone-beam CT (CBCT) imaging on a conventional linear accelerator (LINAC), multiple breath holds are typically required to acquire a complete set of projections for faithful image reconstruction. The long acquisition leads to motion artifacts in the CBCT images and reduces patient comfort during treatment. This study introduces a new diffusion-based image reconstruction method for ultra-fast short-arc CBCT acquisition.
Materials/Methods: To facilitate short-arc image reconstruction, we trained a patient-specific denoising diffusion probabilistic model (DDPM) on the patient’s 4DCT axial images, augmented by image shifts in the anterior-posterior and lateral directions. To expedite training, the images were downsampled to 256x256 resolution. Training required 2 days on 2 Nvidia Quadro RTX 8000 GPUs with 48 GB of memory each. The DDPM used a linear noise schedule with 1000 diffusion steps. Using the trained model as prior knowledge, we applied it to short-arc CBCT reconstruction from limited-angle data acquired over 90-degree and 45-degree arcs. We proposed a conjugate-gradient (CG)-guided diffusion sampling scheme that starts the diffusion-based CBCT reconstruction from an intermediate step, using noise-blended CG-reconstructed images. The patient-specific model aims to minimize hallucination, while the CG-guided backward diffusion sampling scheme expedites reconstruction. For validation, we simulated reconstruction of the patient’s free-breathing CT under cone-beam geometry, aided by the patient-specific diffusion model trained on that patient’s 4DCT images. We assessed reconstructed image quality using peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and mean absolute error (MAE), comparing against the conventional iterative CG least squares (CGLS) method.
Results: The CBCT acquisition can be completed within a single breath hold: 15 seconds for the 90-degree short arc and 7.5 seconds for the 45-degree short arc. Compared with conventional iterative CGLS reconstruction, the 90-degree reconstruction improved PSNR from 22.1 to 26.6 and SSIM from 0.87 to 0.89, and decreased MAE from 35.0 to 18.3. The 45-degree reconstruction improved PSNR from 18.2 to 22.7 and SSIM from 0.83 to 0.85, and decreased MAE from 53.5 to 31.1. With this method, reconstructing a 256x256x110 image with a voxel size of 2x2x1.25 mm took approximately 3 minutes on 2 Nvidia Quadro RTX 8000 GPUs with 48 GB of memory each.
Conclusion: Leveraging patient-specific prior knowledge through diffusion models enables ultra-fast short-arc CBCT image acquisition, which improves image quality, reduces motion artifacts, and increases patient comfort during treatment.
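For readers unfamiliar with the CGLS baseline named in the methods, the following is a minimal sketch of the textbook CGLS iteration for a least-squares problem min ||Ax - b||. It is not the authors' implementation: the small dense matrix `A` is a hypothetical stand-in for a cone-beam projection operator, and the function name, iteration count, and tolerance are illustrative assumptions.

```python
import numpy as np

def cgls(A, b, num_iters=50, tol=1e-16):
    """Textbook CGLS: conjugate gradients applied to the normal equations
    A^T A x = A^T b without ever forming A^T A explicitly."""
    x = np.zeros(A.shape[1])
    r = b - A @ x              # residual in data (projection) space
    s = A.T @ r                # gradient direction in image space
    p = s.copy()               # current search direction
    gamma = s @ s
    for _ in range(num_iters):
        if gamma < tol:        # gradient is negligible: converged
            break
        q = A @ p
        alpha = gamma / (q @ q)
        x += alpha * p
        r -= alpha * q
        s = A.T @ r
        gamma_new = s @ s
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return x

# Toy usage: a random overdetermined system (assumed, for illustration only).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
x_cgls = cgls(A, b)
```

In a real CBCT setting the products `A @ p` and `A.T @ r` would be replaced by forward- and back-projection operators, and with limited-angle data the iteration is typically stopped early.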
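The idea of starting the reverse diffusion from an intermediate step with a noise-blended CG image can be sketched with the standard DDPM forward-noising formula, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise. The abstract specifies only a linear schedule with 1000 steps; the schedule endpoints (1e-4 and 0.02), the start step `t_start = 500`, and the function names below are assumptions for illustration, and the random array stands in for a CG-reconstructed slice.

```python
import numpy as np

def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule. The endpoint values are assumed defaults,
    not values reported in the abstract."""
    return np.linspace(beta_start, beta_end, num_steps)

def noise_blend(x_cg, t_start, alpha_bar, rng):
    """Forward-diffuse a CG reconstruction to step t_start:
    x_t = sqrt(alpha_bar_t) * x_cg + sqrt(1 - alpha_bar_t) * noise."""
    noise = rng.standard_normal(x_cg.shape)
    a = alpha_bar[t_start]
    return np.sqrt(a) * x_cg + np.sqrt(1.0 - a) * noise

betas = linear_beta_schedule()
alpha_bar = np.cumprod(1.0 - betas)   # cumulative signal-retention factor

rng = np.random.default_rng(0)
x_cg = rng.uniform(-1.0, 1.0, size=(256, 256))  # stand-in for a CG-reconstructed slice
t_start = 500                                    # hypothetical intermediate step
x_t = noise_blend(x_cg, t_start, alpha_bar, rng)
```

Starting the reverse process from `x_t` rather than from pure noise at the final step means fewer denoising steps are needed, which is consistent with the abstract's statement that the CG-guided scheme expedites reconstruction.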
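Two of the reported image-quality metrics, PSNR and MAE, follow directly from their standard definitions and can be sketched as below. The abstract does not state the intensity units or the dynamic range used for PSNR, so `data_range` is an explicit assumption here, and the tiny arrays are illustrative only. SSIM is omitted from the sketch because it involves windowed local statistics.

```python
import numpy as np

def mae(reference, test):
    """Mean absolute error between two images of the same shape."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    return float(np.mean(np.abs(diff)))

def psnr(reference, test, data_range):
    """Peak signal-to-noise ratio in dB for an assumed dynamic range."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))

# Toy usage: a constant offset of 0.1 on an image with unit dynamic range.
reference = np.zeros((4, 4))
test_image = np.full((4, 4), 0.1)
mae_value = mae(reference, test_image)
psnr_value = psnr(reference, test_image, data_range=1.0)
```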