TY - JOUR
T1 - Distance analysis of large data sets of categorical variables using object weights
AU - Groenen, Patrick J.F.
AU - Commandeur, Jacques J.F.
AU - Meulman, Jacqueline J.
PY - 1998/11
Y1 - 1998/11
N2 - Categorical variables are often analysed by multiple correspondence (or homogeneity) analysis, which places great emphasis on graphical representation. A drawback of this method is that sometimes only minor aspects of the data are displayed, or, if a dominant first dimension exists, the horseshoe effect occurs. Here we elaborate on a competing approach to multiple correspondence analysis based on distance approximation. This method emphasizes the distance between objects; they are graphically displayed as points, and objects close together are considered more similar than objects farther apart. A limiting factor of this method is that the number of objects cannot be very large (say, no more than 500). We show how the majorization algorithm for distance approximation can be extended using frequency counts as object weights such that much larger data sets can be analysed without a significant amount of additional computational effort. A second advantage of the use of object weights is that resampling methods, such as the bootstrap, are easily implemented. We present two illustrative examples, and investigate the stability in one of them through the bootstrap.
AB - Categorical variables are often analysed by multiple correspondence (or homogeneity) analysis, which places great emphasis on graphical representation. A drawback of this method is that sometimes only minor aspects of the data are displayed, or, if a dominant first dimension exists, the horseshoe effect occurs. Here we elaborate on a competing approach to multiple correspondence analysis based on distance approximation. This method emphasizes the distance between objects; they are graphically displayed as points, and objects close together are considered more similar than objects farther apart. A limiting factor of this method is that the number of objects cannot be very large (say, no more than 500). We show how the majorization algorithm for distance approximation can be extended using frequency counts as object weights such that much larger data sets can be analysed without a significant amount of additional computational effort. A second advantage of the use of object weights is that resampling methods, such as the bootstrap, are easily implemented. We present two illustrative examples, and investigate the stability in one of them through the bootstrap.
UR - http://www.scopus.com/inward/record.url?scp=0009209173&partnerID=8YFLogxK
U2 - 10.1111/j.2044-8317.1998.tb00678.x
DO - 10.1111/j.2044-8317.1998.tb00678.x
M3 - Article
AN - SCOPUS:0009209173
SN - 0007-1102
VL - 51
SP - 217
EP - 232
JO - British Journal of Mathematical and Statistical Psychology
JF - British Journal of Mathematical and Statistical Psychology
IS - 2
ER -