R Implementation of Wordpiece Tokenization

Apply 'Wordpiece' tokenization to input text, given an appropriate vocabulary. The 'BERT' tokenization conventions are used by default.
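
To make the description above concrete, here is a minimal base-R sketch of the greedy longest-match-first splitting usually associated with Wordpiece. It is not the package's implementation and does not call the package. The function name `wordpiece_sketch`, its arguments, and the toy vocabulary are hypothetical and chosen only for illustration. The `##` continuation prefix and the `[UNK]` token are assumed from the BERT conventions.

```r
# Illustrative sketch only: greedy longest-match-first split of ONE word.
# `vocab` is a character vector of known pieces; pieces that continue a word
# carry the "##" prefix (assumed here, following the BERT convention).
wordpiece_sketch <- function(word, vocab, unk_token = "[UNK]") {
  n <- nchar(word)
  if (n == 0) return(character(0))
  pieces <- character(0)
  start <- 1
  while (start <= n) {
    end <- n
    found <- NA_character_
    # Try the longest remaining substring first, then shrink from the right.
    while (end >= start) {
      candidate <- substr(word, start, end)
      if (start > 1) candidate <- paste0("##", candidate)
      if (candidate %in% vocab) {
        found <- candidate
        break
      }
      end <- end - 1
    }
    # No vocabulary piece fits here: the whole word becomes the unknown token.
    if (is.na(found)) return(unk_token)
    pieces <- c(pieces, found)
    start <- end + 1
  }
  pieces
}

# Hypothetical toy vocabulary, not taken from any real model.
toy_vocab <- c("token", "##ization", "##ize", "un", "##aff", "##able")

wordpiece_sketch("tokenization", toy_vocab)  # "token" "##ization"
wordpiece_sketch("unaffable", toy_vocab)     # "un" "##aff" "##able"
wordpiece_sketch("xyz", toy_vocab)           # "[UNK]"
```

The sketch handles a single, already normalized word. Splitting text on whitespace and punctuation, case handling, and loading a real vocabulary file are left out.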


install.packages("wordpiece")

1.0.2 by Jonathan Bratt, 21 days ago


https://github.com/jonathanbratt/wordpiece


Report a bug at https://github.com/jonathanbratt/wordpiece/issues


Browse source code at https://github.com/cran/wordpiece


Authors: Jonathan Bratt [aut, cre], Jon Harmon [aut], Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]




License: Apache License (>= 2)


Imports: digest, purrr, rappdirs, stringi

Suggests: testthat, knitr, rmarkdown, covr

