A comparison of several cluster algorithms on artificial binary data [Part 1]. Scenarios from travel market segmentation [Part 2: Working Paper 19].

Sara Dolnicar, Friedrich Leisch, Andreas Weingessel, Christian Buchta, Evgenia Dimitriadou

Publication: Working/Discussion Paper, WU Working Paper


Abstract

Social scientists confronted with the problem of segmenting individuals into plausible subgroups usually encounter two main problems. First, there is very little indication about the correct choice of the number of clusters to search for. Second, different cluster algorithms, and even multiple replications of the same algorithm, result in different solutions due to random initializations and stochastic learning methods. In the worst case, numerous solutions are found which all seem plausible as far as interpretation is concerned. The consequence is that, in the end, clusters are postulated that are in fact "chosen" by the researcher, as he or she makes decisions on the number of clusters and on the solution chosen as the "final" one. In this paper we concentrate on the power and stability of several popular clustering algorithms under the condition that the correct number of clusters is known. Artificial data sets modeled to mimic typical situations from tourism marketing are constructed. The structure of these data sets is described in several scenarios, and artificial binary data are generated accordingly. These data, ranging from very simple to more complex, real-data-like structures, enable us to systematically analyze the "behavior" of the cluster methods. Section 3 gives an overview of all cluster methods under investigation. Section 4 describes our experimental results, comparing first all scenarios and then all cluster methods. To accomplish this task, several evaluation criteria for cluster methods are proposed. Finally, Sections 5 and 6 draw some conclusions and give an outlook on future research. (author's abstract)

Publication series

Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
Number: 7

WU Working Paper Series

  • Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
