Interobserver agreement in automated metabolic tumor volume measurements of Deauville score 4 and 5 lesions at interim 18F-FDG PET in DLBCL

Gerben J.C. Zwezerijnen, Jakoba J. Eertink, Coreline N. Burggraaff, Sanne E. Wiegers, Ekhlas A.I.N. Shaban, Simone Pieplenbosch, Daniela E. Oprea-Lager, Pieternella J. Lugtenburg, Otto S. Hoekstra, Henrica C.W. de Vet, Josee M. Zijlstra, Ronald Boellaard*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

8 Citations (Scopus)


Metabolic tumor volume (MTV) on interim PET (I-PET) is a potential prognostic biomarker for diffuse large B-cell lymphoma (DLBCL). Implementation of MTV on I-PET requires consensus on which semi-automated segmentation method delineates lesions most successfully with the least user interaction. Methods used for baseline PET are not necessarily optimal for I-PET because lesional standardized uptake values (SUV) are lower at I-PET. Therefore, we aimed to evaluate which method provides the best delineation quality of Deauville score (DS) 4-5 DLBCL lesions on I-PET with the best interobserver agreement on delineation quality and, secondly, to assess the effect of lesional SUVmax on delineation quality and interobserver agreement.
Methods: DS4-5 lesions from 45 I-PET scans were delineated using six semi-automated methods: i) SUV 2.5, ii) SUV 4.0, iii) adaptive threshold [A50%peak], iv) 41% of maximum SUV [41%max], v) majority vote including voxels detected by ≥2 methods [MV2] and vi) majority vote including voxels detected by ≥3 methods [MV3]. Delineation quality per MTV was rated by three independent observers as acceptable or non-acceptable. For each method, observer scores on delineation quality, specific agreements and MTV were assessed for all lesions and per category of lesional SUVmax (<5, 5-10, >10).
Results: In 60 DS4-5 lesions on I-PET, MV3 performed best, with acceptable delineation in 90% of lesions and a positive agreement (PA) of 93%. Delineation quality scores and agreements per method strongly depended on lesional SUV: the best delineation quality scores were obtained using MV3 in lesions with SUVmax<10 and SUV4.0 in more FDG-avid lesions. Consequently, overall delineation quality and PA improved when the most preferred method per SUV category was applied instead of MV3 as the single best method. MV3- and SUV4.0-derived MTVs of lesions with SUVmax>10 were comparable after excluding visually failed MV3 contouring. For lesions with SUVmax<10, MTVs using different methods correlated poorly.
Conclusion: On I-PET, MV3 performed best and provided the highest interobserver agreement regarding acceptable delineations of DS4-5 DLBCL lesions. However, delineation method preference strongly depended on lesional SUV. Therefore, we suggest exploring an approach that identifies the optimal delineation method per lesion as a function of tumor FDG uptake characteristics, i.e., SUVmax.
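
The majority-vote idea and the SUVmax-dependent method preference described in the abstract can be illustrated with a minimal sketch. This is not the authors' software. It assumes that each component method yields a boolean voxel mask over the same lesion region, that MV2 and MV3 keep voxels selected by at least 2 or 3 of the component masks, and that MTV is the voxel count multiplied by the voxel volume. The function names, the toy SUV values, the 0.064 mL voxel volume, and the use of 50% of SUVmax as a stand-in for A50%peak (which in practice uses SUVpeak with a background correction) are all assumptions made for illustration. How a lesion with SUVmax of exactly 10 should be handled is not stated in the abstract, so the cut-off below is also an assumption.

```python
import numpy as np


def fixed_threshold_mask(suv, threshold):
    """Voxels with SUV at or above a fixed threshold (e.g. 2.5 or 4.0)."""
    return suv >= threshold


def fraction_of_max_mask(suv, fraction):
    """Voxels at or above a fraction of the maximum SUV (e.g. 0.41 for 41%max)."""
    return suv >= fraction * suv.max()


def majority_vote(masks, min_votes):
    """Keep voxels selected by at least `min_votes` of the given masks."""
    votes = np.sum(np.stack(masks).astype(int), axis=0)
    return votes >= min_votes


def volume_ml(mask, voxel_volume_ml):
    """Metabolic tumor volume as voxel count times voxel volume."""
    return float(mask.sum()) * voxel_volume_ml


def preferred_method(suv_max):
    """Method preference by SUVmax as reported in the Results.

    Assumption: a lesion with SUVmax of exactly 10 is assigned to SUV4.0.
    """
    return "MV3" if suv_max < 10 else "SUV4.0"


# Toy 2D "lesion region"; the values are invented for illustration only.
suv = np.array([
    [1.0, 3.0, 5.0],
    [2.6, 12.0, 6.0],
    [0.5, 4.5, 2.0],
])

component_masks = [
    fixed_threshold_mask(suv, 2.5),   # SUV 2.5
    fixed_threshold_mask(suv, 4.0),   # SUV 4.0
    fraction_of_max_mask(suv, 0.41),  # 41%max
    fraction_of_max_mask(suv, 0.50),  # crude stand-in for A50%peak (assumption)
]

mv2 = majority_vote(component_masks, min_votes=2)
mv3 = majority_vote(component_masks, min_votes=3)

VOXEL_VOLUME_ML = 0.064  # hypothetical 4 x 4 x 4 mm voxel
print("MV2 volume (mL):", volume_ml(mv2, VOXEL_VOLUME_ML))
print("MV3 volume (mL):", volume_ml(mv3, VOXEL_VOLUME_ML))
print("Preferred method:", preferred_method(float(suv.max())))
```

In this toy case MV3 is the stricter consensus and yields a volume no larger than MV2. Clinical tools typically restrict each method to a region grown around a lesion rather than thresholding a whole array, a step this sketch leaves out.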

Original language: English
Journal: Journal of Nuclear Medicine
Issue number: 11
Publication status: Published - 1 Nov 2021

Bibliographical note

Funding Information:
Department of Radiology and Nuclear Medicine, Amsterdam UMC, Vrije Universiteit Amsterdam, Cancer Center Amsterdam, De Boelelaan 1117, 1081HV Amsterdam, Netherlands; +31(0)2044449638. Word count: 4999. Financial support: PETRA and RADIOMICS studies are supported by Alpe d'HuZes/KWF (Dutch Cancer Society; #VU2012-5848 and #VU2018-11648). Running title: MTV measurements on interim-PET DLBCL.

Publisher Copyright:
© 2021 Society of Nuclear Medicine Inc. All rights reserved.

