Biterm Topic Models for Short Text

Biterm Topic Models (BTM) find topics in collections of short texts. A BTM is a word co-occurrence based topic model that learns topics by modeling word-word co-occurrence patterns, which are called biterms. This is in contrast to traditional topic models such as Latent Dirichlet Allocation and Probabilistic Latent Semantic Analysis, which model word-document co-occurrences. A biterm consists of two words co-occurring in the same short text window. This context window can be, for example, a Twitter message, a short answer on a survey, a sentence of a text, or a document identifier. The techniques are explained in detail in the paper 'A Biterm Topic Model For Short Text' by Xiaohui Yan, Jiafeng Guo, Yanyan Lan, and Xueqi Cheng (2013).
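To make the notion of a biterm concrete, here is a minimal Python sketch of biterm extraction. It is not the package's implementation: the function name `extract_biterms`, the `window` parameter and its default, and the toy documents are all assumptions made for illustration. It only shows how unordered word pairs could be collected from tokenized short texts before any topic inference takes place.

```python
from collections import Counter


def extract_biterms(tokens, window=15):
    """Return the unordered word pairs (biterms) found in one short text.

    Hypothetical helper: two tokens form a biterm when they are fewer than
    `window` positions apart. The window semantics are an assumption.
    """
    biterms = []
    for i, left in enumerate(tokens):
        # Pair the current token with the tokens that follow it in the window.
        for right in tokens[i + 1 : i + window]:
            # Sort each pair so that (a, b) and (b, a) count as the same biterm.
            biterms.append(tuple(sorted((left, right))))
    return biterms


# Toy corpus: each short text is already tokenized (illustrative data only).
docs = [
    ["topic", "model", "short", "text"],
    ["short", "text", "tweet"],
]

# Corpus-level biterm counts: the co-occurrence statistics a biterm-based
# model works from, instead of per-document word counts.
counts = Counter(b for doc in docs for b in extract_biterms(doc))
print(counts.most_common(3))
```

Pooling the pairs over the whole corpus, rather than counting words per document, is what lets this kind of model cope with the sparsity of very short texts.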



Version 0.2 by Jan Wijffels, 25 days ago

Authors: Jan Wijffels [aut, cre, cph] (R wrapper), BNOSAC [cph] (R wrapper), Xiaohui Yan [ctb, cph] (BTM C++ library)

Documentation: PDF Manual

License: Apache License 2.0

Imports: Rcpp, utils

Suggests: udpipe

Linking to: Rcpp

System requirements: C++11

See at CRAN