Performance of five automated white matter hyperintensity segmentation methods in a multicenter dataset

Rutger Heinen*, Martijn D. Steenwijk, TRACE-VCI Study Grp, Frederik Barkhof, J. Matthijs Biesbroek, Wiesje M. van der Flier, H. J. Kuijf, N. D. Prins, Hugo Vrenken, Geert Jan Biessels, Jeroen de Bresser, E. van den Berg, J. M. F. Boomsma, L. G. Exalto, D. A. Ferro, C. J. M. Frijns, O. N. Groeneveld, N. M. van Kalsbeek, J. H. Verwer, J. de Bresser, H. J. Kuijf, M. E. Emmelot-Vonk, H. L. Koek, M. R. Benedictus, J. Bremer, A. E. Leeuwis, J. Leijenaar, N. D. Prins, P. Scheltens, B. M. Tijms, M. P. Wattjes, C. E. Teunissen, T. Koene, J. M. F. Boomsma, H. C. Weinstein, M. Hamaker, R. Faaij, M. Pleizier, M. Prins, E. Vriens

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review



White matter hyperintensities (WMHs) are a common manifestation of cerebral small vessel disease and are increasingly studied in large, pooled multicenter datasets. Such data pooling increases statistical power, but poses challenges for automated WMH segmentation. Although there is extensive literature on the evaluation of automated WMH segmentation methods, evaluations in a multicenter setting are lacking. We performed WMH segmentations in sixty patients scanned on six different magnetic resonance imaging (MRI) scanners (10 patients per scanner) using five freely available and fully automated WMH segmentation methods (Cascade, kNN-TTP, Lesion-TOADS, LST-LGA and LST-LPA). Different MRI scanner vendors and field strengths were included. We compared these automated WMH segmentations with manual WMH segmentations as a reference. The performance of each method, both within and across scanners, was assessed by its spatial and volumetric correspondence with the reference segmentations, measured by Dice's similarity coefficient (DSC) and the intra-class correlation coefficient (ICC), respectively. We found the best performance, both within and across scanners, for kNN-TTP, followed by LST-LPA and LST-LGA, with worse performance for Lesion-TOADS and Cascade. Our findings can serve as a guide for choosing a method and also highlight the importance of further improving and evaluating the consistency of methods in a multicenter setting.
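As an illustration of the spatial measure named in the abstract, the following is a minimal sketch of Dice's similarity coefficient for two binary segmentation masks. It is not the authors' implementation: the function name `dice_coefficient`, the flattened-list mask representation, the handling of two empty masks, and the toy masks are all assumptions made for this example.

```python
from typing import Iterable


def dice_coefficient(automated: Iterable[int], reference: Iterable[int]) -> float:
    """Dice's similarity coefficient between two flattened binary masks.

    DSC = 2 * |A intersect R| / (|A| + |R|), where A and R are the sets of
    voxels labelled as WMH in the automated and reference segmentations.
    """
    a = [bool(v) for v in automated]
    r = [bool(v) for v in reference]
    if len(a) != len(r):
        raise ValueError("masks must contain the same number of voxels")

    # Voxels labelled as WMH in both segmentations.
    overlap = sum(1 for x, y in zip(a, r) if x and y)
    total = sum(a) + sum(r)
    # Assumption: two empty masks are treated as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical toy masks (flattened), not data from the study.
automated_mask = [0, 1, 1, 1, 0, 0, 1, 0]
reference_mask = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice_coefficient(automated_mask, reference_mask))  # 0.75
```

Here three voxels are labelled by both masks and each mask labels four voxels, so the DSC is 2 × 3 / (4 + 4) = 0.75. A value of 1 indicates complete spatial overlap with the reference and 0 indicates none.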

Original language: English
Article number: 16742
Number of pages: 12
Journal: Scientific Reports
Publication status: Published - 14 Nov 2019

Bibliographical note

N.P.A. Zuithoff, assistant professor in Biostatistic Research for his help in the statistical analyses. The TRACE-VCI study is supported by Vidi grant 91711384 and Vici grant 91816616 from ZonMw, The Netherlands Organisation for Health Research and Development, and by grant 2010T073 from the Dutch Heart Association to Geert Jan Biessels. Research of the VUMC Alzheimer Center is part of the neurodegeneration research program of the Neuroscience Campus Amsterdam. The VUMC Alzheimer Center is supported by Stichting Alzheimer Nederland and Stichting VUMC fonds. F.B. is supported by the NIHR UCLH biomedical research center.

