Tools to create, modify and manage 'CWB' Corpora

The 'Corpus Workbench' ('CWB', <http://cwb.sourceforge.net/>) offers a classic and mature approach to working with large, linguistically and structurally annotated corpora. The 'CWB' is memory-efficient, and its design makes running queries fast (Evert and Hardie 2011, <http://www.stefan-evert.de/PUB/EvertHardie2011.pdf>). The 'cwbtools' package offers pure R tools to create indexed corpus files, as well as high-level wrappers for the original C implementation of 'CWB' as exposed by the 'RcppCWB' package (<https://CRAN.R-project.org/package=RcppCWB>). Additional functionality to add and modify corpus annotations from within R makes working with CWB-indexed corpora more flexible and convenient. In combination with 'RcppCWB' and 'polmineR' (<https://CRAN.R-project.org/package=polmineR>), 'cwbtools' offers a lightweight infrastructure that supports combining quantitative and qualitative approaches to working with textual data.


install.packages("cwbtools")

0.1.1 by Andreas Blaette, 6 days ago


Browse source code at https://github.com/cran/cwbtools


Authors: Andreas Blaette [aut, cre], Christoph Leonhardt [ctb]


Documentation: PDF Manual


GPL-3 license


Imports: data.table, R6, xml2, stringi, curl, RcppCWB, pbapply, methods

Suggests: tm, knitr, tokenizers, tidytext, SnowballC, janeaustenr, devtools, polmineR, NLP

