N-Gram Analysis of Biological Sequences

Tools for extraction and analysis of various n-grams (k-mers) derived from biological sequences (proteins or nucleic acids). Contains QuiPT (quick permutation test) for fast feature-filtering of the n-gram data.


biogram package

This package contains tools for extracting and analysing various n-grams (sequences of n items) derived from biological sequences (proteins or nucleic acids). Because the number of possible n-grams grows exponentially with n, n-gram data suffers from the curse of dimensionality; biogram addresses this with the Quick Permutation Test (QuiPT), which provides fast feature filtering.
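To make the notion of an n-gram concrete, here is a minimal base R sketch that counts contiguous n-grams in a single sequence. It does not use biogram's own functions: `count_contiguous_ngrams` is a hypothetical helper written only for illustration, and it ignores gapped or position-specific n-grams.

```r
# Hypothetical helper (not part of biogram): count contiguous n-grams
# in one sequence given as a single character string.
count_contiguous_ngrams <- function(sequence, n) {
  chars <- strsplit(sequence, "")[[1]]
  # A sequence shorter than n contains no n-grams.
  if (length(chars) < n) {
    return(table(character(0)))
  }
  # Every position from 1 to length - n + 1 starts exactly one n-gram.
  starts <- seq_len(length(chars) - n + 1)
  ngrams <- vapply(
    starts,
    function(i) paste(chars[i:(i + n - 1)], collapse = ""),
    character(1)
  )
  table(ngrams)
}

counts <- count_contiguous_ngrams("ACGTACGA", 2)
print(counts)
```

For this toy nucleotide sequence the 2-gram AC occurs twice. With a 20-letter amino acid alphabet there are already 20^3 = 8000 possible 3-grams, which is why feature filtering becomes necessary.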


biogram is available on CRAN, so installation is as simple as:
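The command itself did not survive on this page. Assuming the standard CRAN workflow, it would presumably be:

```r
# Standard CRAN installation (assumed; the original snippet is missing here).
install.packages("biogram")
```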


You can install the latest development version of the code using the devtools R package.
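The original snippet is missing here as well. A likely form, assuming the development repository is the one named in the bug-report link (michbur/biogram on GitHub), is:

```r
# Assumes devtools is already installed: install.packages("devtools")
# Repository name taken from the issue tracker URL on this page.
devtools::install_github("michbur/biogram")
```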



To cite the package, type:
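The command is not shown on this page. In R, citation information for an installed package is normally retrieved with the base `citation()` function, so the missing line was presumably:

```r
# Prints the recommended citation for the installed package.
citation("biogram")
```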


or use: Michal Burdukiewicz, Piotr Sobczyk and Chris Lauber (2016). biogram: N-Gram Analysis of Biological Sequences. R package version 1.3. https://cran.r-project.org/package=biogram




Version 1.6.3 by Michal Burdukiewicz, a year ago


Report a bug at https://github.com/michbur/biogram/issues

Browse source code at https://github.com/cran/biogram

Authors: Michal Burdukiewicz [cre, aut], Piotr Sobczyk [aut], Chris Lauber [aut], Dominik Rafacz [aut], Katarzyna Sidorczuk [ctb]

Documentation: PDF Manual

GPL-3 license

Imports combinat, entropy, partitions

Depends on slam

Suggests ggplot2, knitr, testthat

Imported by AmpGram, AmyloGram, CancerGram.

See at CRAN