On the generation of correlated artificial binary data

Friedrich Leisch, Andreas Weingessel, Kurt Hornik

Publication: Working/Discussion Paper, WU Working Paper


Abstract

The generation of random variates from multivariate binary distributions has not gained as much interest in the literature as generation from, e.g., multivariate normal or Poisson distributions. Yet binary variables are important in many types of applications. Our main interest is in the segmentation of marketing data, where data come from customer questionnaires with "yes/no" questions. Artificial data provide a valuable tool for the analysis of segmentation tools, because data with known structure can be constructed to mimic situations from the real world (Dolnicar et al. 1998). Questionnaire data can be highly correlated when several questions covering the same field are likely to be answered similarly by a subject. In this paper we present a computationally fast method to simulate multivariate binary distributions with a given correlation structure. The implementation of the algorithm in R, an implementation of the S statistical language, is described in the appendix.

Publication series

Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
Number: 13

WU Working Paper Series

  • Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
