Dissimilarity Plots: A Visual Exploration Tool for Partitional Clustering

Michael Hahsler, Kurt Hornik

Publication: Working/Discussion Paper, WU Working Paper

Abstract

For hierarchical clustering, dendrograms provide a convenient and powerful visualization. Although many visualization methods have been suggested for partitional clustering, their usefulness deteriorates quickly with increasing dimensionality of the data and/or they fail to represent structure between and within clusters simultaneously. In this paper we extend (dissimilarity) matrix shading with several reordering steps based on seriation. Both methods, matrix shading and seriation, have long been well known. However, only recent algorithmic improvements allow seriation to be used for larger problems. Furthermore, seriation is used in a novel stepwise process (within each cluster and between clusters), which leads to a visualization technique that is independent of the dimensionality of the data. A major advantage is that it presents the structure between clusters and the micro-structure within clusters in one concise plot. This not only allows cluster quality to be judged but also makes mis-specification of the number of clusters apparent. We give a detailed discussion of the construction of dissimilarity plots and demonstrate their usefulness with several examples.
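
The stepwise reordering described in the abstract (between clusters, then within each cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: all function names are hypothetical, and a simple greedy nearest-neighbor chain stands in for the seriation methods discussed in the paper. It assumes Euclidean dissimilarities and a given partition (cluster labels).

```python
import math


def distance_matrix(points):
    """Pairwise Euclidean dissimilarities between points (lists of floats)."""
    return [[math.dist(p, q) for q in points] for p in points]


def greedy_order(items, dissim):
    """Order items as a nearest-neighbor chain starting at items[0].

    Assumption: this heuristic is only a stand-in for a real seriation method.
    """
    remaining = list(items)
    order = [remaining.pop(0)]
    while remaining:
        last = order[-1]
        nxt = min(remaining, key=lambda x: dissim(last, x))
        remaining.remove(nxt)
        order.append(nxt)
    return order


def dissimilarity_plot_order(d, labels):
    """Return object indices ordered between clusters, then within clusters."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)

    def avg_between(a, b):
        # Average dissimilarity between the members of clusters a and b.
        pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
        return sum(d[i][j] for i, j in pairs) / len(pairs)

    # Step 1: place similar clusters next to each other.
    cluster_order = greedy_order(sorted(clusters), avg_between)
    # Step 2: order the objects inside each cluster.
    order = []
    for lab in cluster_order:
        order.extend(greedy_order(clusters[lab], lambda i, j: d[i][j]))
    return order


def reorder(d, order):
    """Apply the same permutation to rows and columns of the matrix."""
    return [[d[i][j] for j in order] for i in order]


def shade(d):
    """Render the matrix as text; denser characters mean smaller dissimilarity."""
    ramp = "#*+. "
    top = max(max(row) for row in d) or 1.0
    return [
        "".join(ramp[min(int(v / top * len(ramp)), len(ramp) - 1)] for v in row)
        for row in d
    ]


# Toy data: three clusters on a line, given in scrambled order.
points = [[0.0], [20.0], [1.0], [10.0], [21.0], [11.0]]
labels = [0, 1, 0, 2, 1, 2]
d = distance_matrix(points)
order = dissimilarity_plot_order(d, labels)
for row in shade(reorder(d, order)):
    print(row)
```

In the printed matrix, each cluster appears as a dense block on the diagonal, and the shading of the off-diagonal blocks indicates how close neighboring clusters are. The number of features in `points` does not affect the plot, since only the dissimilarity matrix is drawn.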
Original language: English
Publication status: Published, 1 Sept 2009

Publication series

Series: Research Report Series / Department of Statistics and Mathematics
Number: 89

