TY - JOUR
T1 - Multicollinearity and redundancy of the PET radiomic feature set
AU - Noortman, Wyanne A.
AU - Vriens, Dennis
AU - Bussink, Johan
AU - Meijer, Tineke W.H.
AU - Aarntzen, Erik H.J.G.
AU - Deroose, Christophe M.
AU - Lhommel, Renaud
AU - Aide, Nicolas
AU - Le Tourneau, Christophe
AU - de Koster, Elizabeth J.
AU - Oyen, Wim J.G.
AU - Triemstra, Lianne
AU - Ruurda, Jelle P.
AU - Vegt, Erik
AU - de Geus-Oei, Lioe Fee
AU - van Velden, Floris H.P.
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025/11
Y1 - 2025/11
N2 - Introduction: The aim of this study was to map multicollinearity of the radiomic feature set in five independent [18F]FDG-PET cohorts with different tumour types and identify generalizable non-redundant features. Methods: Five [18F]FDG-PET radiomic cohorts were analysed: non-small cell lung carcinomas (N = 35), pheochromocytomas and paragangliomas (N = 40), head and neck squamous cell carcinomas (N = 54), [18F]FDG-positive thyroid nodules with indeterminate cytology (N = 84), and gastric carcinomas (N = 206). Lesions were delineated, and 105 radiomic features were extracted using PyRadiomics. In every cohort, Spearman’s rank correlation coefficient (ρ) matrices of features were calculated to determine which features showed strong (ρ > 0.7) or very strong (ρ > 0.9) correlations with any other feature in all five cohorts. Cluster analysis of an averaged correlation matrix for all cohorts was performed at thresholds of ρ = 0.7 and ρ = 0.9. For each cluster, a representative, non-redundant feature was selected. Results: Seventy-two and 90 out of 105 features showed a very strong (ρ > 0.9) and strong (ρ > 0.7) correlation, respectively, with another feature in all five cohorts. Cluster analysis resulted in 35 and 15 non-redundant features at thresholds of ρ = 0.9 and ρ = 0.7, including 6 and 3 shape features, 4 and 2 intensity features, and 25 and 10 texture features, respectively. Seventy and 90 redundant features could be omitted at these thresholds, respectively. Conclusion: At least two-thirds of the radiomic feature set could be omitted because of strong multicollinearity in multiple independent cohorts. More redundant features could be identified using a less conservative threshold. Future research should indicate whether multicollinearity of the radiomic feature set is similar for other radiopharmaceuticals and imaging modalities. Key Points: Question: Radiomic feature sets contain many strongly correlated features, which results in statistical challenges. Findings: Analysis of the correlation matrices showed that the same radiomic features were strongly correlated in five independent [18F]FDG-PET cohorts with different tumour types. Clinical relevance: At least two-thirds of the radiomic feature set could be omitted because of strong multicollinearity. More redundant features could be identified using a less conservative threshold.
AB - Introduction: The aim of this study was to map multicollinearity of the radiomic feature set in five independent [18F]FDG-PET cohorts with different tumour types and identify generalizable non-redundant features. Methods: Five [18F]FDG-PET radiomic cohorts were analysed: non-small cell lung carcinomas (N = 35), pheochromocytomas and paragangliomas (N = 40), head and neck squamous cell carcinomas (N = 54), [18F]FDG-positive thyroid nodules with indeterminate cytology (N = 84), and gastric carcinomas (N = 206). Lesions were delineated, and 105 radiomic features were extracted using PyRadiomics. In every cohort, Spearman’s rank correlation coefficient (ρ) matrices of features were calculated to determine which features showed strong (ρ > 0.7) or very strong (ρ > 0.9) correlations with any other feature in all five cohorts. Cluster analysis of an averaged correlation matrix for all cohorts was performed at thresholds of ρ = 0.7 and ρ = 0.9. For each cluster, a representative, non-redundant feature was selected. Results: Seventy-two and 90 out of 105 features showed a very strong (ρ > 0.9) and strong (ρ > 0.7) correlation, respectively, with another feature in all five cohorts. Cluster analysis resulted in 35 and 15 non-redundant features at thresholds of ρ = 0.9 and ρ = 0.7, including 6 and 3 shape features, 4 and 2 intensity features, and 25 and 10 texture features, respectively. Seventy and 90 redundant features could be omitted at these thresholds, respectively. Conclusion: At least two-thirds of the radiomic feature set could be omitted because of strong multicollinearity in multiple independent cohorts. More redundant features could be identified using a less conservative threshold. Future research should indicate whether multicollinearity of the radiomic feature set is similar for other radiopharmaceuticals and imaging modalities. Key Points: Question: Radiomic feature sets contain many strongly correlated features, which results in statistical challenges. Findings: Analysis of the correlation matrices showed that the same radiomic features were strongly correlated in five independent [18F]FDG-PET cohorts with different tumour types. Clinical relevance: At least two-thirds of the radiomic feature set could be omitted because of strong multicollinearity. More redundant features could be identified using a less conservative threshold.
UR - https://www.scopus.com/pages/publications/105004691084
U2 - 10.1007/s00330-025-11637-7
DO - 10.1007/s00330-025-11637-7
M3 - Article
C2 - 40332568
AN - SCOPUS:105004691084
SN - 0938-7994
VL - 35
SP - 6905
EP - 6916
JO - European Radiology
JF - European Radiology
IS - 11
ER -