Constructing finite-context sources from fractal representations of symbolic sequences

Peter Tino, Georg Dorffner

Publication: Working/Discussion Paper, WU Working Paper



We propose a novel approach to constructing predictive models on long, complex symbolic sequences. The models are constructed by first transforming the n-block structure of the training sequence into a spatial structure of points in a unit hypercube. The transformation between the symbolic and Euclidean spaces embodies a natural smoothness assumption (n-blocks with long common suffixes are likely to produce similar continuations): the longer the common suffix shared by two n-blocks, the closer their point representations lie. Finding a set of prediction contexts is then formulated as a resource allocation problem, solved by vector-quantizing the spatial representation of the training sequence's n-block structure. Our predictive models are similar in spirit to variable memory length Markov models (VLMMs). We compare the proposed models with both classical and variable memory length Markov models on two chaotic symbolic sequences with different levels of subsequence distribution structure. Our models have equal or better modeling performance, yet their construction is more intuitive (unlike with VLMMs, we have a clear idea of the size of the model under construction) and easier to automate (our models can be constructed in a completely self-organized manner, which is shown to be problematic in the case of VLMMs). (author's abstract)
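The suffix-based smoothness property described above can be illustrated with a minimal sketch of a chaos-game-style mapping from n-blocks to points in the unit square (a 4-symbol alphabet mapped to the square's corners). The alphabet, the contraction ratio k = 1/2, and the example blocks are illustrative assumptions, not taken from the paper:

```python
import math

# Illustrative 4-symbol alphabet mapped to corners of the unit square.
CORNERS = {"a": (0.0, 0.0), "b": (0.0, 1.0), "c": (1.0, 0.0), "d": (1.0, 1.0)}

def block_to_point(block, k=0.5):
    """Iterate x_t = k * x_{t-1} + (1 - k) * corner(s_t) over the block.

    Later symbols (the suffix) dominate the final position, so two blocks
    sharing a common suffix of length L end up within k**L * sqrt(2) of
    each other, regardless of their prefixes.
    """
    x, y = 0.5, 0.5  # start at the centre of the unit square
    for s in block:
        cx, cy = CORNERS[s]
        x = k * x + (1 - k) * cx
        y = k * y + (1 - k) * cy
    return (x, y)

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Blocks with different prefixes but the common suffix "bcd" map close together,
# while blocks that agree everywhere except the last symbol map far apart.
close = dist(block_to_point("aabcd"), block_to_point("cdbcd"))
far = dist(block_to_point("aaaaa"), block_to_point("aaaad"))
print(close < far)  # True: shared suffix dominates the point location
```

A set of prediction contexts would then be obtained by vector-quantizing the resulting point cloud (e.g. with k-means), assigning one predictive distribution per codebook vector; that quantization step is omitted here for brevity.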


Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
