Cluster differences scaling is a method for partitioning a set of n stimuli or objects into K classes and simultaneously finding a low-dimensional spatial representation of the K cluster points, so as to model a given square table of dissimilarities among the objects. The least squares loss function of cluster differences scaling, originally defined only on the residuals of pairs of objects that are allocated to different clusters, is extended with a loss component for pairs that are allocated to the same cluster. It is shown that this extension makes the method equivalent to multidimensional scaling with cluster constraints on the coordinates. A decomposition of the sum of squared dissimilarities into contributions from several sources of variation is described, including the appropriate degrees of freedom for each source. After a convergent algorithm for fitting the cluster differences model has been developed, it is argued that the individual objects and the cluster locations can be jointly displayed in a configuration obtained as a by-product of the optimization. Finally, the paper introduces a fuzzy version of the loss function, which can be used in a successive approximation strategy for avoiding local minima. A simulation study demonstrates that this strategy significantly outperforms two other well-known initialization strategies, and that it attains the global minimum in 92 out of 100 cases.
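As an illustration of the extended loss, the following minimal sketch evaluates a least squares loss in which every object is placed at its cluster point, so that pairs in the same cluster have model distance zero. It assumes an unweighted, unnormalized loss with Euclidean distances; the function name, argument names, and the small data set are hypothetical and are not taken from the paper.

```python
import math

def cluster_constrained_loss(delta, labels, centers):
    """Least squares loss over all pairs i < j, split by pair type.

    delta   : n x n symmetric table of dissimilarities (list of lists)
    labels  : labels[i] is the cluster index (0..K-1) of object i
    centers : K x p coordinates of the cluster points (list of lists)

    Assumption: unit weights and Euclidean distances between cluster points.
    """
    n = len(delta)
    between = 0.0  # pairs allocated to different clusters
    within = 0.0   # pairs allocated to the same cluster (model distance 0)
    for i in range(n):
        for j in range(i + 1, n):
            # Each object inherits the coordinates of its cluster point.
            d = math.dist(centers[labels[i]], centers[labels[j]])
            residual_sq = (delta[i][j] - d) ** 2
            if labels[i] == labels[j]:
                within += residual_sq
            else:
                between += residual_sq
    return between, within, between + within

# Hypothetical example: three objects, two clusters, two dimensions.
delta = [[0.0, 1.0, 5.0],
         [1.0, 0.0, 5.0],
         [5.0, 5.0, 0.0]]
labels = [0, 0, 1]
centers = [[0.0, 0.0], [3.0, 4.0]]
between, within, total = cluster_constrained_loss(delta, labels, centers)
```

In this toy case the between-cluster pairs are fitted exactly, and the entire loss comes from the single within-cluster pair, whose dissimilarity of 1 is modeled by a distance of 0. The between-cluster term corresponds to the original loss described above, and the within-cluster term to the added component.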