Differences Between MR Brain Region Segmentation Methods: Impact on Single-Subject Analysis

W. Huizinga*, D. H.J. Poot, E. J. Vinke, F. Wenzel, E. E. Bron, N. Toussaint, C. Ledig, H. Vrooman, M. A. Ikram, W. J. Niessen, M. W. Vernooij, S. Klein

*Corresponding author for this work

Research output: Contribution to journal > Article > Academic > peer-review

For the segmentation of magnetic resonance brain images into anatomical regions, numerous fully automated methods have been proposed and compared to reference segmentations obtained manually. However, systematic differences might exist between the resulting segmentations, depending on the segmentation method and underlying brain atlas. This potentially results in differences in sensitivity to disease and can further complicate the comparison of individual patients to normative data. In this study, we aim to answer two research questions: 1) to what extent are methods interchangeable, as long as the same method is being used for computing normative volume distributions and patient-specific volumes? and 2) can different methods be used for computing normative volume distributions and assessing patient-specific volumes? To answer these questions, we compared volumes of six brain regions calculated by five state-of-the-art segmentation methods: Erasmus MC (EMC), FreeSurfer (FS), geodesic information flows (GIF), multi-atlas label propagation with expectation–maximization (MALP-EM), and model-based brain segmentation (MBS). We applied the methods to 988 non-demented (ND) subjects and computed the correlation (PCC-v) and absolute agreement (ICC-v) on the volumes. For most regions, the PCC-v was good ((Formula presented.)), indicating that volume differences between methods in ND subjects are mainly due to systematic differences. The ICC-v was generally lower, especially for the smaller regions, indicating that it is essential that the same method is used to generate normative and patient data. To evaluate the impact on single-subject analysis, we also applied the methods to 42 patients with Alzheimer’s disease (AD). In the case where the normative distributions and the patient-specific volumes were calculated by the same method, the patient’s distance to the normative distribution was assessed with the z-score. We determined the diagnostic value of this z-score, which proved to be consistent across methods. The absolute agreement on the AD patients’ z-scores (ICC-z) was high for the thalamus and putamen. This is encouraging as it indicates that the studied methods are interchangeable for these regions. For regions such as the hippocampus, amygdala, caudate nucleus and accumbens, and globus pallidus, not all method combinations showed a high ICC-z. Whether two methods are indeed interchangeable should be confirmed for the specific application and dataset of interest.
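The distinction the abstract draws between correlation (PCC), absolute agreement (ICC), and the within-method z-score can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the volumes are made-up toy numbers rather than study data, the ICC is taken to be the two-way, absolute-agreement, single-measurement form, and the z-score is the plain unadjusted version. The abstract does not state which ICC variant or which covariate adjustments the authors used.

```python
import numpy as np


def pearson_cc(a, b):
    """Pearson correlation between the volumes reported by two methods."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])


def icc_absolute_agreement(a, b):
    """Two-way, absolute-agreement, single-measurement ICC (assumed variant).

    Rows are subjects, columns are the two segmentation methods.
    """
    y = np.column_stack([a, b]).astype(float)
    n, k = y.shape
    grand = y.mean()
    row_means = y.mean(axis=1)  # per-subject mean over methods
    col_means = y.mean(axis=0)  # per-method mean over subjects
    ms_rows = k * np.sum((row_means - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand) ** 2) / (k - 1)
    resid = y - row_means[:, None] - col_means[None, :] + grand
    ms_err = np.sum(resid ** 2) / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    return float((ms_rows - ms_err) / denom)


def z_score(patient_volume, normative_volumes):
    """Unadjusted distance of one patient volume to a normative sample."""
    norm = np.asarray(normative_volumes, dtype=float)
    return float((patient_volume - norm.mean()) / norm.std(ddof=1))


if __name__ == "__main__":
    # Toy normative volumes (arbitrary units); method B reads 1.0 higher
    # than method A for every subject, i.e. a purely systematic offset.
    method_a = np.array([3.0, 3.5, 4.0, 4.5, 5.0])
    method_b = method_a + 1.0

    print("PCC-v:", pearson_cc(method_a, method_b))
    print("ICC-v:", icc_absolute_agreement(method_a, method_b))

    # Same patient, segmented by each method.
    patient_a, patient_b = 2.5, 3.5
    print("z (A vs. A norms):", z_score(patient_a, method_a))
    print("z (B vs. B norms):", z_score(patient_b, method_b))
    print("z (A vs. B norms):", z_score(patient_a, method_b))
```

In this toy setting a constant offset leaves the correlation perfect while lowering the absolute agreement. The z-score is unaffected as long as the patient and the normative sample come from the same method, and it shifts when the two are mixed.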

Original language: English
Article number: 577164
Journal: Frontiers in Big Data
Publication status: Published - 30 Jul 2021

Bibliographical note

Funding Information:
The research leading to these results has received funding from the European Union Seventh Framework Programme FP7/2007-2013, project VPH-DARE@IT (Grant Agreement No: 601055) and from the European Union’s Horizon 2020 research and innovation programme, project EuroPOND (Grant Agreement No: 666992).

Publisher Copyright:
© 2021 Huizinga, Poot, Vinke, Wenzel, Bron, Toussaint, Ledig, Vrooman, Ikram, Niessen, Vernooij and Klein.

