Clustering of EORTC QLQ-C30 health-related quality of life scales across several cancer types: validation study

Introduction: The European Organisation for Research and Treatment of Cancer Quality of Life Core Questionnaire (EORTC QLQ-C30) measures 15 health-related quality of life (HRQoL) scales relevant to the disease and treatment of patients with cancer. A study by Martinelli (2011) demonstrated that these scales could be grouped into three main clusters: physical, psychological and gastrointestinal. This study aims to validate Martinelli's findings in an independent dataset and evaluate whether these clusters are consistent across cancer types and patient characteristics. Methods: The pre-defined criteria for successful validation were that three main clusters should emerge with a minimum R² value of 0.51 using pooled baseline data. A cluster analysis was performed on the 15 QLQ-C30 HRQoL scales in the overall dataset, as well as by cancer type and selected patient characteristics, to examine the robustness of the results. Results: The dataset consisted of 20,066 patients pooled across 17 cancer types. Overall, three main clusters were identified (R² = 0.61): the physical cluster included role-functioning, physical-functioning, social-functioning, fatigue, pain and global-health status; the psychological cluster included emotional-functioning, cognitive-functioning and insomnia; the gastro-intestinal cluster included nausea/vomiting and appetite loss. The results were consistent across different levels of disease severity, socio-demographic and clinical characteristics, with minor variations by cancer type. Global-health status was found to be strongly linked to the scales included in the physical-functioning-related cluster. Conclusion: This study successfully validated prior findings by Martinelli (2011): the QLQ-C30 scales are interrelated and can be grouped into three main clusters.
Knowing how these multidimensional HRQoL scales are related to each other can help clinicians and patients with cancer in managing symptom burden, guide policymakers in defining social-support plans and inform the selection of HRQoL scales in future clinical trials.


Introduction
Treatment efficacy is usually the main goal in cancer clinical trials and is often measured in terms of patient survival. Almost every anti-cancer therapeutic strategy that has an intention to cure interferes with the integrity of the body in some way. Thus, patients with cancer often experience multiple symptoms resulting from associated treatments and the disease itself [1]. These may affect the functioning and well-being of a patient, resulting in poor quality of life. Even though survival end-points remain the most used primary end-points of interest in cancer clinical trials, health-related quality of life (HRQoL) is now increasingly considered as an important secondary or co-primary end-point for assessing clinical benefit of treatment [2-4].
HRQoL is a multidimensional concept that refers to the patient's subjective perception of the impact of the disease and treatments on the physical, psychological and social aspects of daily life [5]. A comprehensive approach is required to design, analyze and interpret results [5-8]. Due to the multi-dimensionality of HRQoL outcomes, it is likely that these outcomes are interrelated. Thus, it is informative to assess the existence of clusters so that individual symptoms or outcomes can act as indicators for co-occurring problems otherwise not detected [9]. This will also aid in selecting outcomes of interest in assessing HRQoL in cancer clinical trials.
Furthermore, several studies have shown that cancer symptoms are inter-related and often occur in clusters [10-12]. For instance, Walsh et al. (2006) identified seven clusters in the analysis of 25 symptoms assessed using a 38-symptom checklist in patients with advanced cancer using hierarchical cluster analysis [10]. Chow et al. (2008) identified 3 symptom clusters at baseline in patients with brain metastases before and after radiotherapy, indicating the robust existence of interrelationships between the symptoms [11]. A literature review on symptom clusters in patients with cancer identified various clusters within the selected 7 studies [12]. Furthermore, a study by Gundy et al. investigated the statistical fit of six higher-order models for summarising the QLQ-C30 HRQoL questionnaire using confirmatory factor analysis and found that the physical/mental health model had the best fit [22].
It is worth noting that the characterisation of symptom clusters often focuses on patient symptoms and seldom incorporates other aspects of HRQoL that cover patients' functioning abilities, which are equally important in managing patients with cancer [15]. HRQoL indicators, such as physical, emotional, social, cognitive and role functioning, have also been shown to be inter-related and to be correlated with various symptom scales (e.g. physical-functioning vs pain), as well as being predictive of survival in cancer clinical trials [13,14]. This reiterates the need to have a more holistic picture of the interrelationships among the various HRQoL indicators, which will better inform our choices on effective patient management strategies. Martinelli et al. (2011) explored the way in which HRQoL scales, measured by the European Organisation for Research and Treatment of Cancer Quality of Life Core Questionnaire (EORTC QLQ-C30), cluster among patients with cancer and how possible clusters depend on different socio-demographic and clinical characteristics. The study also identified HRQoL scales that are related to patients' evaluation of their own overall quality of life as assessed by the global-health status scale of the QLQ-C30. The study demonstrated that the 15 HRQoL scales are inter-related and could be grouped into three main clusters. The same clusters were reproduced across different socio-demographic and clinical characteristics with minor variations among cancer types [15].
However, to increase the confidence in using these exploratory findings in clinical research, it is important to critically evaluate the robustness and generalisability of these findings with an independent dataset. This study aims to perform a validation of these findings in an independent dataset of patients treated on clinical trials and evaluate whether these clusters are consistent across different cancer types and other patient characteristics using a similar methodology. A secondary objective of this study is to find out which HRQoL scales are strongly linked to the global-health status that measures the overall HRQoL of a patient.

Data description
Published clinical trial data for this study were obtained from the European Organisation for Research and Treatment of Cancer (EORTC), Project Data Sphere [16], Mayo Clinic and Canadian Cancer Trials Group databases. Baseline data were pooled across 55 clinical trials that assessed HRQoL using the EORTC QLQ-C30 across 17 cancer types. None of these trials were previously used in the Martinelli et al. (2011) analyses. Patients' socio-demographic and clinical data of interest included gender, metastatic disease status, disease stage, WHO performance status (WHO PS), prior treatment status and patient's age.

The EORTC QLQ-C30
Patients' HRQoL was assessed using the EORTC QLQ-C30 version 3, which is one of the most widely used questionnaires for assessing the quality of life of patients with cancer. The reliability and validity of the QLQ-C30 are highly consistent across different language and cultural groups and the questionnaire has been translated into more than 110 different languages [17,18]. The QLQ-C30 consists of 30 items which are grouped into five functional scales (physical, role, emotional, social and cognitive functioning), three symptom scales (fatigue, nausea/vomiting and pain), six single-item scales (dyspnea, insomnia, appetite loss, constipation, diarrhoea and financial difficulties) and one global-health status scale (GHS). The QLQ-C30 scales are scored according to a standard scoring manual [19], with the scores for each scale ranging from 0 to 100. For the functioning scales and GHS, higher scores represent a higher degree of functioning, while the higher the score for symptom scales, the higher the level of symptom burden.
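As an illustration of the 0 to 100 scoring described above, the following minimal Python sketch applies the usual linear transformation: the raw score is the mean of the answered items, which is rescaled by the item response range and reversed for the functioning scales. The function name, its parameters and the rule of scoring a scale only when at least half of its items are answered are assumptions made for this sketch; the scoring manual [19] remains the authoritative reference.

```python
from statistics import mean


def qlq_c30_scale_score(items, item_range=3, functional=False):
    """Linear 0-100 transformation of one QLQ-C30 scale (illustrative sketch).

    items: raw item responses for the scale; None marks a missing item.
    item_range: maximum minus minimum response (3 for items answered 1-4,
        6 for the two global-health items answered 1-7).
    functional: True for the five functioning scales, so that a higher
        score means better functioning.
    """
    answered = [x for x in items if x is not None]
    # Assumption: a scale is scored only if at least half its items are answered.
    if len(answered) * 2 < len(items):
        return None
    raw = mean(answered)
    scaled = (raw - 1) / item_range * 100
    return 100 - scaled if functional else scaled


# Hypothetical responses, for illustration only.
fatigue = qlq_c30_scale_score([2, 3, 2])                  # symptom scale
role = qlq_c30_scale_score([1, 2], functional=True)       # functioning scale
ghs = qlq_c30_scale_score([5, 6], item_range=6)           # global-health status
```

With this convention a patient answering "not at all" to every item scores 100 on a functioning scale and 0 on a symptom scale, which matches the direction of interpretation given in the text.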

Statistical analysis
Patients' socio-demographic, clinical and HRQoL data at baseline were summarised using descriptive statistics. To explore associations between the 15 QLQ-C30 scales, Spearman rank correlations were calculated. Following an approach similar to that of Martinelli et al., a cluster analysis was performed on the 15 QLQ-C30 scales. Subgroup analyses for each cancer type and selected patient characteristics were also performed to examine the robustness of the results.
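For readers unfamiliar with the Spearman rank correlation, the sketch below shows the computation in plain Python: each scale is replaced by its ranks (ties share their average rank) and the Pearson correlation of the ranks is taken. The function names and the toy scores are hypothetical; the analyses reported here were run in SAS, not with this code.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy scores for five hypothetical patients.
fatigue = [0, 33, 67, 100, 33]
pain = [0, 17, 50, 83, 33]
rho = spearman(fatigue, pain)
```

Because QLQ-C30 scales take only a few distinct values, ties are frequent, which is why the average-rank handling matters in practice.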
Agglomerative hierarchical cluster analysis was performed to explore the existence of homogeneous groups among the 15 HRQoL scales in the overall dataset comprising baseline data, pooled across all cancer types. Cluster analysis seeks to partition the observations into distinct groups so that observations within each group are quite similar to each other, while observations in different groups are quite different from each other [20]. This technique assumes that each HRQoL scale is a cluster at the start, and then proceeds to merge the two most similar clusters and evaluate their similarity. This procedure is repeated in a hierarchical stepwise fashion until all scales are assembled into a single cluster. The similarity between various clusters was assessed via Ward's method, which assumes that if two clusters are similar, then the between-cluster sum of squares should be small. A tree-like representation of the clusters, a dendrogram, is produced for easier identification of the clusters. The earlier the cluster fusion on the dendrogram, the more similar the groups of observations are to each other [20].
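The merging procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SAS procedure used in this study: each scale is standardised and treated as a point whose coordinates are the patients' scores, and at every step the pair of clusters whose fusion gives the smallest increase in within-cluster sum of squares (Ward's criterion) is merged. The function names and the toy data are hypothetical, and functioning scales are assumed to have been reversed (100 minus the score) beforehand so that all scales point in the same direction.

```python
def standardise(v):
    """Z-scores of one scale; a constant scale would raise ZeroDivisionError."""
    n = len(v)
    m = sum(v) / n
    sd = (sum((a - m) ** 2 for a in v) / n) ** 0.5
    return [(a - m) / sd for a in v]


def ward_clustering(scales):
    """Agglomerative clustering of scales with Ward's criterion.

    scales: dict mapping scale name -> list of scores (same patients, same order).
    Returns the merge history as (members_a, members_b, increase_in_within_SS).
    """
    # Each cluster is keyed by its member names and stores its centroid.
    clusters = {(name,): standardise(v) for name, v in scales.items()}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = list(clusters)
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                gap = sum((p - q) ** 2 for p, q in zip(clusters[a], clusters[b]))
                # Ward: increase in within-cluster SS if a and b are fused.
                cost = len(a) * len(b) / (len(a) + len(b)) * gap
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        ca, cb = clusters.pop(a), clusters.pop(b)
        # Centroid of the fused cluster is the size-weighted mean.
        clusters[a + b] = [
            (len(a) * p + len(b) * q) / (len(a) + len(b)) for p, q in zip(ca, cb)
        ]
        merges.append((a, b, cost))
    return merges


# Toy data for five hypothetical patients; "role" is already reversed.
toy = {
    "fatigue": [10, 20, 30, 40, 50],
    "role": [12, 19, 33, 41, 48],
    "nausea": [50, 10, 40, 20, 30],
}
merges = ward_clustering(toy)
```

The merge history is what a dendrogram displays: early merges with a small increase in sum of squares join the most similar scales, and later merges join progressively less similar groups.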
The proportion of variance explained by the clusters, the R²-value, was used to select the optimal number of clusters. The higher the R²-value, the greater the difference between clusters [20]. Based on the results of Martinelli et al., pre-defined criteria for successful replication were set: three main clusters should emerge (physical, psychological and gastro-intestinal) with a minimum R²-value of 0.51. Internal consistency for each cluster was assessed using Cronbach's α. Greater consistency is indicated by higher values of the α-coefficient [21]. SAS version 9.4 was used to carry out all analyses.
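To make the two summary statistics concrete, the sketch below computes them in plain Python under simple assumptions. The R²-value is taken as one minus the ratio of the within-cluster sum of squares to the total sum of squares, and Cronbach's α uses the standard formula based on the variances of the individual scales and of their sum. The function names and the toy inputs are hypothetical, and the scales within a cluster are assumed to be scored in the same direction.

```python
from statistics import variance


def r_squared(points, clusters):
    """Proportion of the total sum of squares explained by a partition.

    points: dict mapping name -> coordinate vector.
    clusters: list of lists of names that together cover all points.
    """
    def sum_of_squares(names):
        centroid = [sum(col) / len(names) for col in zip(*(points[n] for n in names))]
        return sum((x - m) ** 2 for n in names for x, m in zip(points[n], centroid))

    total = sum_of_squares(list(points))
    within = sum(sum_of_squares(c) for c in clusters)
    return 1 - within / total


def cronbach_alpha(items):
    """Cronbach's alpha for k scales measured on the same patients.

    items: list of k score lists, one per scale, in the same patient order.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(variance(v) for v in items) / variance(totals))


# Toy example: two tight, well-separated groups of points.
toy_points = {"a": [0, 0], "b": [0, 2], "c": [10, 0], "d": [10, 2]}
toy_r2 = r_squared(toy_points, [["a", "b"], ["c", "d"]])

# Toy example: two scales that move together across four patients.
toy_alpha = cronbach_alpha([[10, 20, 30, 40], [12, 18, 33, 39]])
```

A partition with every point in one cluster gives an R² of 0 and a partition of singletons gives 1, so the statistic is only informative when read against the number of clusters, as done in this study.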

Data preparation
Baseline data from 24,658 patients were pooled from 55 closed randomised clinical trials. Of these, 3268 (13%) were excluded because of invalid baseline QoL forms, and 1324 (5%) were excluded because of missing QoL forms. A form was considered a valid baseline form if it was administered 2 weeks before or after randomisation, provided that it was collected before the start of treatment. Thus, the final analysis dataset consisted of 20,066 patients with complete baseline data (Fig. 1).
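The validity rule and the patient counts above can be checked with a short Python sketch. Reading "2 weeks" as a 14-day window on either side of randomisation, and requiring the form date to be strictly earlier than the treatment start date, are assumptions of this sketch; the function name and the example dates are hypothetical.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=14)  # assumption: "2 weeks" read as 14 days


def is_valid_baseline(form_date, randomisation_date, treatment_start):
    """True if the form falls in the window around randomisation and precedes treatment."""
    within_window = abs(form_date - randomisation_date) <= WINDOW
    before_treatment = form_date < treatment_start  # strict inequality is an assumption
    return within_window and before_treatment


# Hypothetical patient: form completed 9 days after randomisation, before treatment.
example = is_valid_baseline(date(2010, 1, 10), date(2010, 1, 1), date(2010, 1, 15))

# Patient flow reported in the text.
pooled, invalid, missing = 24658, 3268, 1324
analysed = pooled - invalid - missing
pct_invalid = round(100 * invalid / pooled)
pct_missing = round(100 * missing / pooled)
```

The subtraction reproduces the 20,066 patients in the final analysis dataset, and the rounded percentages match the 13% and 5% quoted above.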

Descriptive results
Descriptive statistics for the clinical and socio-demographic patient characteristics collected at baseline are presented in Table 1. Of the 20,066 patients included in the analysis, 30% had metastatic disease and 47% had good WHO PS (= 0). Descriptive statistics for the 15 HRQoL scales are shown in Table 2 for the overall dataset. The average score for the GHS scale across all patients was 65 (SD = 23). The worst average scores for the symptom scales were reported in fatigue (mean = 33, SD = 26). Patients reported the least impaired average symptom scores in diarrhoea (mean = 8, SD = 18).
Mean scores for HRQoL scales were also examined by patient characteristics (Table 2). The largest differences in mean scores ranged from 10 to 21 points. These were observed between patients with good and poor WHO PS; specifically for role-functioning, the difference between good and poor WHO PS average score was 21 points. Patients with good WHO PS (= 0) reported higher scores on functional scales and very low scores on symptom scales compared to patients with performance status scores 1. Also, younger patients (60 years) reported a higher level of functioning and lower symptom scores than older patients. Patients with locally advanced and metastatic disease reported more impaired scores than patients with early-stage disease. Furthermore, patients who had received prior systemic treatment also reported more impaired scores than those who had not.
The average HRQoL scores were also compared across the different cancer types (Tables 3 and 4). On average, patients with melanoma reported higher scores on functioning scales and very low scores on symptom scales, while patients with pancreatic cancer reported the most impaired scores in almost all the scales, worse than the other cancer types. Patients with testicular and bladder cancer reported higher average scores for pain. The strongest correlations were observed between fatigue and role-functioning (0.71) and between fatigue and physical-functioning (0.70). On the other hand, the lowest correlation was observed between diarrhoea and constipation (0.04) (Table 5).
Fig. 1. Flowchart (study selection).

Main results
Results from the cluster analysis performed in the overall dataset are summarised in Fig. 2. As shown in the dendrogram, the first two clusters to be merged were role-functioning and fatigue, followed by physical-functioning. Overall, seven clusters were identified for an R²-value of 0.61. The three main clusters were identified and mirrored those presented by Martinelli et al., namely physical functioning-related, which includes role-functioning, physical-functioning, social-functioning, fatigue, pain and GHS (Cronbach's α = 0.91); psychological functioning-related, which includes emotional-functioning, cognitive-functioning and insomnia (Cronbach's α = 0.68); and gastro-intestinal related, which includes nausea/vomiting and appetite loss (Cronbach's α = 0.63). The GHS scale was found to be part of the physical-functioning related cluster in the overall dataset (Fig. 2). Constipation, dyspnoea, diarrhoea and financial problems were each included as separate single-scale clusters.
This result was consistent across different levels of disease severity, age, gender, prior-treatment status, metastatic disease status and WHO PS (appendix Figs. 1-6). However, variations in the cluster structure were observed when looking at individual cancer types. All seven clusters, including the three main clusters, were reproduced in prostate, breast, gastric and sarcoma patient subgroups. In all the other cancer-type subgroups, the scales were mixed in different clusters but the cluster structure of the three main clusters was maintained (appendix Figs. 7-11).

Discussion
This study aimed to evaluate the robustness and generalisability of the exploratory findings by Martinelli et al. [15]. Given the current replication crisis in psychology and medical research, it is critical to validate the exploratory findings by Martinelli et al., who examined how the scales of the EORTC QLQ-C30 at baseline clustered among the treated patients with cancer. The study also checked whether the identified clusters were consistent across patients' clinical and socio-demographic characteristics, as well as across different cancer types. Our study successfully validated the key findings from the work of Martinelli, using independent data pooled from 55 trials that assessed HRQoL using the QLQ-C30. This implies that these findings remain consistent and provide support for the generalisability of these clusters across various cancer types.
The three main clusters originally identified were confirmed in our overall pooled dataset and were consistently observed across various subgroups. This was in line with Martinelli's findings, where no major differences in terms of cluster structure were found in most of the subgroups. Diarrhoea was not included as part of the gastro-intestinal cluster, probably because of the low number of patients who experienced diarrhoea in the trials that were included in this study.
A secondary objective of this study was to find out which HRQoL scales were strongly linked to the global perception of GHS. GHS was found to be part of the physical-functioning related cluster in the overall dataset. The result was consistent across different levels of socio-demographic and clinical characteristics with minor differences by cancer type. This confirms that the HRQoL scales in the physical-functioning related cluster have a stronger link to the patient's perception of their overall quality of life compared to scales in other clusters. These findings were also observed by Martinelli et al., overall and by patient subgroups.
Results from this study are informative for clinical research. These findings allow us to have a better understanding of how these multidimensional HRQoL scales or outcomes are related to each other. If one scale is impacted, it is likely that another scale in the same cluster is also impacted. One of the three main clusters identified by Martinelli and validated in our project included insomnia together with cognitive and emotional functioning. Therefore, sleeplessness may serve as a screening indicator for more structural underlying depression that is less easily elucidated. Rather than treating only the insomnia problem with medication, a more in-depth assessment of the patient's emotional status could be advised.
These findings may also be relevant for clinical trial design. As QLQ-C30 scales from the same cluster have high intercorrelation, such scales should not be treated as independent outcomes. This is often the case currently when QLQ-C30 scales are analysed with harsh multiplicity corrections. Applying a decision rule based on pre-set conditions being fulfilled for scales within a cluster may be more applicable and can result in a reduced sample size. In addition, the identified clusters can help in the selection of scales from the QLQ-C30 as primary endpoints for a clinical trial. These findings may also aid clinicians and cancer patients to manage symptoms and symptom burden by understanding which patient problems are more likely to affect a patient's HRQoL [10]. The focus will not only be on understanding individual patient symptoms but also on understanding all the symptoms that occur together.
This study is a validation study and has some limitations. Missing values are a common problem in HRQoL research. We observed 5% incomplete data in the overall dataset. These were patients who did not fill in all the items in the HRQoL form. A complete case analysis strategy was used to handle missing data. The data used in this study were retrieved from clinical trial databases, where not all data may have been available due to data sharing restrictions (e.g. unknown values: 43.5% for disease stage, 33% for treatment status and 27% for metastatic status). Furthermore, the data used in this study originate from controlled clinical trials, each with specific patient selection and treatment criteria. This may restrict the generalisability of the observed findings to patients not covered by the included clinical trials. It also limits investigation into differences between the various disease sites, with some having only a few trial data available (e.g. anal cancer (n = 66) and endometrial cancer (n = 101)).
Our study only assessed HRQoL clusters at baseline. However, it may be interesting to investigate if the observed results are consistent at different assessment time points. Selecting a uniform follow-up timepoint in a study like ours that pooled data from multiple studies with varying assessment schedules remains a challenge. Clusters were explored using hierarchical cluster analysis. Other methodologically stronger statistical techniques could be explored to support the findings of this study. For example, Gundy et al. used higher-order models for the QLQ-C30 to compare the statistical fit of six alternative models using confirmatory factor analysis in a large sample of patients [22]. However, this could be considered in the future as it is beyond the scope of this study.
Table 4. Mean and standard deviation of baseline HRQoL scale scores in the overall population and by disease site (continuation).
In conclusion, our results confirm the tendency of certain HRQoL issues to occur together and validate the prior findings from Martinelli's study. Improving our understanding of how these multidimensional scales are related can help clinicians and patients with cancer to better manage symptom burden, guide policymakers in defining social support plans and inform the selection of HRQoL scales in future clinical trials.
Fig. 2. Dendrogram overall dataset. AP, appetite loss; CF, cognitive functioning; CO, constipation; DI, diarrhea; DY, dyspnea; EF, emotional functioning; FA, fatigue; NV, nausea/vomiting; PA, pain; PF, physical functioning; QL, global quality of life; RF, role functioning; SF, social functioning; SI, insomnia; FI, financial problems. The three main clusters were identified in the overall dataset, i.e. the physical-related, psychological-related and gastro-intestinal clusters. For consistency in direction, before performing the cluster analysis, the functional scales were reversed to match the direction of the symptom scales so that a lower score represents a higher level of functioning. The scales that were consistent in these three main clusters include (i) physical functioning, role functioning, fatigue, global quality of life and pain (physical-related cluster), (ii) emotional functioning, cognitive functioning and insomnia (psychological-related cluster) and (iii) appetite loss and nausea/vomiting (gastro-intestinal cluster). The remaining scales were mostly in single-item clusters.
Gastrointestinal Tract Cancer, Head and Neck Cancer, Genito-Urinary Cancers, and Gynecological Cancer Groups) and the principal investigators involved in the various EORTC trials. The authors are also grateful to Project Data Sphere, Mayo Clinic and Canadian Cancer Trials Group for contributing data to this project. Thanks to Paul Novotny for facilitating the preparation and transfer of data from Mayo Clinic. Finally, special thanks to all the patients who participated in the trials used in this study.

Table 1
Patient socio-demographic and clinical characteristics.