Voting in clustering and finding the number of clusters

Evgenia Dimitriadou, Andreas Weingessel, Kurt Hornik

Publication: Working/Discussion Paper, WU Working Paper


Abstract

In this paper we present an unsupervised algorithm that clusters a given data set and can also find the number of clusters present in it. The algorithm consists of two techniques. The first, the voting technique, combines several runs of clustering algorithms, each with a predefined number of clusters, into a common partition. We introduce the idea that there are cases where an input point belongs to a structure only with a certain degree of confidence and may belong to more than one cluster with a certain degree of "belongingness". The second is an index measure that receives the results of the voting process for different numbers of clusters and decides in favor of one of them. The result is a complete clustering scheme that can be applied to any clustering method and to any type of data set. Moreover, it helps to overcome instabilities of clustering algorithms and improves the ability of a clustering algorithm to find structures in a data set.
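To make the voting idea concrete, here is a minimal sketch in Python. It is not the authors' exact procedure: it assumes each run is a hard partition given as a list of labels in `0..k-1`, aligns each run to the running consensus by trying every relabelling (feasible only for small `k`), and averages the aligned votes into fuzzy memberships. The names `vote`, `one_hot`, `runs` and `k` are hypothetical and chosen for illustration only. The index measure for choosing the number of clusters is not sketched, since the abstract does not define it.

```python
from itertools import permutations


def one_hot(labels, k):
    """Turn a hard partition into an n x k membership matrix of 0.0/1.0."""
    return [[1.0 if lab == j else 0.0 for j in range(k)] for lab in labels]


def vote(runs, k):
    """Combine several hard partitions into one fuzzy partition.

    Assumption: every run labels the same points, in the same order,
    with labels in 0..k-1. Label names are arbitrary across runs, so each
    run is relabelled to agree as much as possible with the consensus so far.
    """
    consensus = one_hot(runs[0], k)
    for m, labels in enumerate(runs[1:], start=1):
        # Exhaustive search over relabellings (k! candidates): pick the one
        # whose relabelled clusters overlap most with the current consensus.
        best = max(
            permutations(range(k)),
            key=lambda perm: sum(
                consensus[i][perm[lab]] for i, lab in enumerate(labels)
            ),
        )
        # Running average of the aligned votes over the m + 1 runs seen so far.
        for i, lab in enumerate(labels):
            for j in range(k):
                hit = 1.0 if best[lab] == j else 0.0
                consensus[i][j] = (m * consensus[i][j] + hit) / (m + 1)
    return consensus


# Three runs on five points; the second run uses swapped label names.
runs = [[0, 0, 1, 1, 1], [1, 1, 0, 0, 0], [0, 0, 0, 1, 1]]
membership = vote(runs, k=2)

# Hard assignment and the confidence ("belongingness") attached to it.
hard = [row.index(max(row)) for row in membership]
confidence = [max(row) for row in membership]
```

In this toy input the third point is placed in different clusters by different runs, so its membership is split between the two clusters, while the other points receive full confidence. For larger `k`, the exhaustive relabelling would need to be replaced by an assignment solver or a greedy matching.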

Publication series

Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
Number: 30

WU Working Paper Series

  • Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
