TMTV-Net: fully automated total metabolic tumor volume segmentation in lymphoma PET/CT images — a multi-center generalizability analysis

Fereshteh Yousefirizi, Ivan S. Klyuzhin, Joo Hyun O, Sara Harsini, Xin Tie, Isaac Shiri, Muheon Shin, Changhee Lee, Steve Y. Cho, Tyler J. Bradshaw, Habib Zaidi, François Bénard, Laurie H. Sehn, Kerry J. Savage, Christian Steidl, Carlos F. Uribe, Arman Rahmim

European Journal of Nuclear Medicine and Molecular Imaging (2024)

Abstract

Purpose: Total metabolic tumor volume (TMTV) segmentation enables quantitative imaging biomarkers that are valuable for lymphoma management. In this work, we tackle the challenging task of automated tumor delineation in lymphoma from PET/CT scans using a cascaded approach.

Methods: Our study included 1418 2-[18F]FDG PET/CT scans from four different centers. The dataset was divided into 900 scans for the development/validation/testing phases and 518 scans for multi-center external testing. The former consisted of 450 lymphoma, lung cancer, and melanoma scans, along with 450 negative scans, while the latter consisted of lymphoma patients from different centers with diffuse large B cell, primary mediastinal large B cell, and classic Hodgkin lymphoma. In the first step, PET/CT images are resampled to different voxel sizes, and multi-resolution 3D U-Nets are trained on each resampled dataset using a fivefold cross-validation scheme. The models trained on different data splits were ensembled. In the second step, after applying soft voting to the predicted masks, we input the probability-averaged predictions, along with the input imaging data, into another 3D U-Net. Models were trained with a semi-supervised loss. We additionally evaluated whether test-time augmentation (TTA) improves segmentation performance after training. In addition to quantitative analysis, including Dice score (DSC) and TMTV comparisons, a qualitative evaluation was conducted by nuclear medicine physicians.

Results: Our cascaded soft-voting guided approach achieved an average DSC of 0.68 ± 0.12 on the internal test data from the development dataset and an average DSC of 0.66 ± 0.18 on the multi-site external data (n = 518), significantly outperforming (p < 0.001) state-of-the-art (SOTA) approaches including nnU-Net and SWIN UNETR. While TTA yielded performance gains for some of the comparator methods, its impact on our cascaded approach was negligible (DSC: 0.66 ± 0.16). Our approach reliably quantified TMTV, with a correlation of 0.89 with the ground truth (p < 0.001). Furthermore, in the visual assessment, quantitative evaluations and clinician feedback were concordant in the majority of cases. The average relative error (ARE) and the absolute error (AE) in TMTV prediction on the external multi-center dataset were ARE = 0.43 ± 0.54 and AE = 157.32 ± 378.12 mL for all the external test data (n = 518), and ARE = 0.30 ± 0.22 and AE = 82.05 ± 99.78 mL when the 10% outliers (n = 53) were excluded.

Conclusion: TMTV-Net demonstrates strong performance and generalizability in TMTV segmentation across multi-site external datasets encompassing various lymphoma subtypes. A negligible reduction of 2% in overall performance when testing on external data highlights robust model generalizability across different centers and cancer types, likely attributable to its training with resampled inputs. Our model is publicly available, allowing easy multi-site evaluation and generalizability analysis on datasets from different institutions.
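The quantities named in the abstract (soft voting, DSC, and the TMTV errors) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 0.5 threshold, the empty-mask convention for Dice, and the definition of relative error as |predicted − true| / true are all assumptions made for illustration, since the abstract does not spell them out.

```python
import numpy as np


def soft_vote(prob_maps, threshold=0.5):
    """Average per-voxel foreground probabilities from several models.

    Returns the mean probability map and a binary mask obtained by
    thresholding it (the 0.5 threshold is an assumption).
    """
    mean_prob = np.mean(np.stack(prob_maps, axis=0), axis=0)
    return mean_prob, mean_prob >= threshold


def dice_score(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumed convention)
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def tmtv_ml(mask, voxel_size_mm):
    """Segmented volume in mL: voxel count times voxel volume (1 mL = 1000 mm^3)."""
    voxel_ml = float(np.prod(voxel_size_mm)) / 1000.0
    return float(np.asarray(mask, dtype=bool).sum()) * voxel_ml


def tmtv_errors(pred_ml, true_ml):
    """Absolute error (mL) and relative error |pred - true| / true (assumed definition)."""
    ae = abs(pred_ml - true_ml)
    return ae, ae / true_ml


# Toy 2x2 example with two hypothetical model outputs.
model_a = np.array([[0.9, 0.2], [0.6, 0.1]])
model_b = np.array([[0.7, 0.4], [0.2, 0.3]])
truth = np.array([[1, 0], [1, 0]], dtype=bool)

mean_prob, mask = soft_vote([model_a, model_b])
dsc = dice_score(mask, truth)
ae, are = tmtv_errors(tmtv_ml(mask, (2.0, 2.0, 2.0)), tmtv_ml(truth, (2.0, 2.0, 2.0)))
print(f"DSC={dsc:.3f}  AE={ae:.3f} mL  ARE={are:.2f}")
```

In the cascaded design described above, the averaged probability map (`mean_prob` here) would be passed, together with the PET/CT input, to the second-stage 3D U-Net rather than thresholded directly; the thresholding in this sketch only serves to make the metrics computable on a toy example.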

Keywords

Total metabolic tumor volume, Generalizability, Artificial intelligence, Deep learning, 3D U-Net, PET/CT, Lymphoma
