Multivariate Weibull mixtures with proportional hazard restrictions for dwell time based session clustering with incomplete data

Patrick Mair, Marcus Hudec

Publication: Scientific journal › Journal article › peer-review


Starting from classical Weibull mixture models, we propose a framework for clustering survival data with various more parsimonious models obtained by imposing restrictions on the distributional parameters. We show that these restrictions on the Weibull mixtures correspond to different proportional hazard restrictions across mixture components and Web page areas. A parametric clustering approach based on the EM algorithm is carried out on a multivariate data set. Our model set-up encompasses incomplete-data structures as well as censored observations. We apply the methodology to retail data stemming from a global e-commerce company. Sessions are clustered with respect to the dwell times that a user spends on certain page areas. The cluster solution that is found allows for a detailed examination of the navigation behaviour in terms of the hazard and survivor functions within each component.
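
The link between parameter restrictions and proportional hazards can be illustrated with a minimal sketch. It assumes the common scale/shape parametrization of the Weibull distribution, which may differ from the one used in the article, and all parameter values are hypothetical. When two components share a shape parameter, their hazard ratio is constant over time; when the shapes differ, it is not.

```python
import math


def weibull_hazard(t, scale, shape):
    """Weibull hazard h(t) = (shape/scale) * (t/scale)**(shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_survivor(t, scale, shape):
    """Weibull survivor function S(t) = exp(-(t/scale)**shape)."""
    return math.exp(-((t / scale) ** shape))


# Hypothetical parameters for two mixture components (not taken from the article).
shared_shape = 0.8
scale_a, scale_b = 20.0, 60.0
times = [1.0, 5.0, 30.0, 120.0]  # illustrative dwell times

# Restricted case: both components use the same shape parameter.
ratios_shared = [
    weibull_hazard(t, scale_a, shared_shape) / weibull_hazard(t, scale_b, shared_shape)
    for t in times
]

# Unrestricted case: each component has its own shape parameter.
ratios_free = [
    weibull_hazard(t, scale_a, 0.8) / weibull_hazard(t, scale_b, 1.5)
    for t in times
]

print("shared shape:", ratios_shared)  # constant ratio, equal to (scale_b/scale_a)**shape
print("free shapes: ", ratios_free)    # ratio changes with t
```

With a shared shape the ratio reduces algebraically to `(scale_b / scale_a) ** shared_shape`, so the hazards are proportional; this is one standard way such a restriction arises and is not necessarily the full set of restrictions considered in the article.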
DOI: 10.1111/j.1467-9876.2009.00665.x
Original language: English
Pages (from-to): 619-639
Journal: Journal of the Royal Statistical Society: Series C (Applied Statistics)
Issue number: 5
Publication status: Published - 1 Oct 2009
