Abstract
Topic models allow the probabilistic modeling of term frequency occurrences in documents. The fitted model can be used to estimate the similarity between documents as well as between a set of specified keywords using an additional layer of latent variables which are referred to as topics. The R package topicmodels provides basic infrastructure for fitting topic models based on data structures from the text mining package tm. The package includes interfaces to two algorithms for fitting topic models: the variational expectation-maximization algorithm provided by David M. Blei and co-authors and an algorithm using Gibbs sampling by Xuan-Hieu Phan and co-authors.
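
As a rough illustration of the two fitting interfaces mentioned above, the following minimal sketch fits a two-topic model with each method using the `LDA()` function from topicmodels. It assumes the package is installed; the 20-document subset of the bundled `AssociatedPress` data, the number of topics, the seeds, and the Gibbs settings are arbitrary choices made for this example and are not taken from the paper.

```r
# Minimal sketch; data subset, k, seeds and Gibbs settings are illustrative assumptions.
library(topicmodels)

# Document-term matrix shipped with the package (tm's DocumentTermMatrix class).
data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:20, ]   # small subset so the example runs quickly

# Fit with the variational expectation-maximization (VEM) interface.
vem_fit <- LDA(dtm, k = 2, method = "VEM", control = list(seed = 1))

# Fit with the Gibbs sampling interface.
gibbs_fit <- LDA(dtm, k = 2, method = "Gibbs",
                 control = list(seed = 1, burnin = 100, iter = 200))

# Five most probable terms per topic, and the most likely topic per document.
terms(vem_fit, 5)
topics(gibbs_fit)

# Per-document topic proportions, usable as a basis for comparing documents.
doc_topics <- posterior(vem_fit)$topics
```

Each row of `doc_topics` holds one document's estimated topic proportions, so documents can be compared by applying a distance or similarity measure of one's choice to these rows.
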
Original language | English |
---|---|
Pages (from - to) | 1 - 30 |
Journal | Journal of Statistical Software |
Volume | 40 |
Issue number | 13 |
Publication status | Published - 1 June 2011 |