High-Performance Stemmer, Tokenizer, and Spell Checker for R

A spell checker and morphological analyzer library designed for languages with rich morphology and complex word compounding or character encoding. The package can check and analyze individual words, and it can search for misspelled words within a plain text, LaTeX, HTML or XML document. Use the 'devtools' package to spell check R documentation with 'hunspell'.


This package includes a bundled version of libhunspell and no longer depends on external system libraries:

install.packages("hunspell")

About the R package:

# Check individual words
words <- c("beer", "wiskey", "wine")
correct <- hunspell_check(words)
print(correct)
 
# Find suggestions for incorrect words
hunspell_suggest(words[!correct])
 
# Extract incorrect from a piece of text
bad <- hunspell("spell checkers are not neccessairy for langauge ninja's")
print(bad[[1]])
hunspell_suggest(bad[[1]])
 
# Stemming
words <- c("love", "loving", "lovingly", "loved", "lover", "lovely", "love")
hunspell_stem(words)
hunspell_analyze(words)
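The title also mentions tokenizing, which the examples above do not show. A minimal sketch, assuming a package version that exports hunspell_parse() and the bundled en_US dictionary; the sample sentence is made up for illustration:

```r
library(hunspell)

# Illustrative input (not from the package documentation)
text <- "The quick brown fox jumps over the lazy dog."

# hunspell_parse() splits each input string into words using the
# hunspell parser and returns a list with one character vector per string
tokens <- hunspell_parse(text)
print(tokens[[1]])
```

The result has the same list shape as the output of hunspell(), so the tokens can be passed on to hunspell_check() or hunspell_stem().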

The devtools package uses this package to spell check R package documentation:

# Spell check a package
library(devtools)
spell_check("~/mypackage")

News

2.2

  • Tweak code to make it build on old compilers (CentOS6 / gcc 4.4.7)

2.1

  • Update upstream to a6d32ee
  • Rebuild vignettes to fix CMD check timestamp warning

2.0

  • Added a beautiful intro vignette
  • Dictionaries are now their own class and get cached automatically via memoise
  • Make sure UTF-8 return values are marked properly. Fixes #16
  • Update libhunspell to upstream 4b43843
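The dictionary class introduced in 2.0 can be used roughly as follows. This is a sketch that assumes the bundled en_US dictionary; the variable name 'dict' is only illustrative:

```r
library(hunspell)

# Load the dictionary once; the package caches the loaded object,
# so repeated calls with the same language should be cheap
dict <- dictionary("en_US")
print(class(dict))

# Pass the dictionary object explicitly to the checking functions
ok <- hunspell_check(c("color", "wiskey"), dict = dict)
print(ok)
```

Passing the object explicitly avoids repeating the language name in every call and makes it clear which dictionary a result came from.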

1.4.3

  • Fix UBSAN bug
  • Remove unused 'config.h' file (see upstream 2ccf840)

1.4.2

  • Switch to R's iconv wrapper, which is more portable (thanks BDR)

1.4.1

  • Change license to cover libhunspell (per CRAN request).

1.4

  • Switch to bundled libhunspell because their API keeps breaking
  • Include libhunspell 1.5-pre (b13e62a)
  • Add parsers for HTML/XML formats

1.2

  • (Breaking) Rename 'hunspell_find' to 'hunspell'
  • Add support for other dictionaries
  • Use iconv() to convert encoding before checking
  • Use the 'en_stats' dict as default ignore list
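The default ignore list can be extended with project-specific terms. A hedged sketch, assuming the 'ignore' argument of hunspell() and the exported 'en_stats' word list; the sample text and the extra word are made up:

```r
library(hunspell)

# Illustrative text containing a name that is unlikely to be in en_US
text <- "The ropensci project maintains this package"

# Default behaviour: only the words in 'en_stats' are ignored
print(hunspell(text)[[1]])

# Extend the default ignore list with our own term
custom_ignore <- c(en_stats, "ropensci")
bad <- hunspell(text, ignore = custom_ignore)[[1]]
print(bad)
```

Keeping 'en_stats' in the combined vector preserves the default behaviour while adding the extra terms.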

1.1

  • Switch to hunspell parsers (replaced 'delim' with 'format' parameter)
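The 'format' parameter selects which hunspell parser is applied before checking. A small sketch, assuming the "latex" and "text" values of that parameter; the LaTeX snippet is invented for illustration:

```r
library(hunspell)

# Illustrative LaTeX fragment with one deliberate typo ("sentense")
tex <- "\\section{Intro} This sentense has one \\emph{typo}."

# With the LaTeX parser, markup commands are skipped and only
# the prose is checked
bad_latex <- hunspell(tex, format = "latex")[[1]]
print(bad_latex)

# With the plain text parser, the same input is treated as ordinary
# text, so command names may be reported as well
bad_text <- hunspell(tex, format = "text")[[1]]
print(bad_text)
```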

1.0

  • Initial CRAN release

2.6 by Jeroen Ooms, 2 months ago


https://github.com/ropensci/hunspell#readme (devel) https://hunspell.github.io (upstream)


Report a bug at https://github.com/ropensci/hunspell/issues


Browse source code at https://github.com/cran/hunspell


Authors: Jeroen Ooms


Documentation:   PDF Manual  


Task views: Natural Language Processing


GPL-2 | LGPL-2.1 | MPL-1.1 license


Imports Rcpp, digest

Suggests testthat, devtools, pdftools, janeaustenr, wordcloud2, knitr, rmarkdown

Linking to Rcpp


Imported by BrailleR, TeXCheckR, hrbrthemes, msgtools, ptstem, textstem, tidytext.

Suggested by SpaDES.tools, devtools, fivethirtyeight, quickPlot, reproducible.

