A Deep Recurrent Neural Network Approach to Learn Sequence Similarities for User-Identification

Stefan Vamosi, Thomas Reutterer, Michael Platzer

Publication: Scientific journal › Journal article › peer-review

Abstract

The evolving digital economy entails multifaceted behavioral tracking data such as internet clickstreams, location trajectories, or taste preferences revealed by music or video streaming. Organizations are increasingly interested in using such data streams to profile customers based on their behavioral similarities for targeting purposes. However, measuring similarities in sequential data is a challenging task. We present a generic deep neural-network-based framework for quantifying the similarity of ordered sequences in observed event histories. This novel approach combines a specific type of recurrent neural net with a triplet loss cost function used for network training. It yields an embedding space that serves as a similarity metric for complex sequential data, can handle multivariate sequential data, and can incorporate covariates. We empirically validate the derived similarity metric for user embeddings in the domain of re-identifying users in web browsing histories. We demonstrate its superior performance in discriminating users based on their behavioral browsing patterns by benchmarking against more conventional approaches to measuring sequence similarity. In addition, we show that the methodology can be used for clustering sub-sequences and re-classifying users based on their observed clickstream behavior. Finally, we critically reflect on the benefits and possible downsides of the proposed framework and discuss extensions and promising future applications. An open-source reference implementation can be obtained from github.com/vamosi/tl_rnn.
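
For readers unfamiliar with the triplet loss mentioned in the abstract, the following is a minimal sketch of its commonly used textbook form, not the authors' implementation. It assumes Euclidean distance and a margin of 1.0; the abstract does not state which distance or margin the paper uses. The function names and the hand-written two-dimensional vectors are hypothetical stand-ins for the embeddings that a recurrent network would produce from browsing sequences.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical embeddings (assumption: in practice these come from the network).
anchor = [0.0, 0.0]    # sequence from user A
positive = [0.0, 1.0]  # another sequence from user A, distance 1 from the anchor
negative = [3.0, 4.0]  # sequence from user B, distance 5 from the anchor

# Well-separated triplet: 1 - 5 + 1 is negative, so the loss is clipped to 0.
loss_easy = triplet_loss(anchor, positive, negative)

# Roles swapped: 5 - 1 + 1 = 5, so the triplet is penalized.
loss_hard = triplet_loss(anchor, negative, positive)
```

Minimizing this quantity over many triplets pulls sequences of the same user together and pushes sequences of different users apart, which is what lets distances in the embedding space act as a similarity metric.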
Original language: English
Journal: Decision Support Systems (DSS)
Volume: 155
Issue number: 113718
Publication status: Published - 2022

Austrian Classification of Fields of Science and Technology (ÖFOS)

  • 502019 Marketing
  • 502052 Business administration
  • 502
  • 502020 Market research
