Identify and Correct Invalid HGNC Human Gene Symbols and MGI Mouse Gene Symbols

Contains functions for identifying and correcting HGNC human gene symbols and MGI mouse gene symbols that have been converted to date format by Excel, withdrawn, or aliased. Also contains functions for reversibly converting between HGNC symbols and valid R names.
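
As a minimal sketch of typical use, assuming the package is installed and relying on the symbol map bundled with it, checkGeneSymbols() takes a character vector of symbols and returns a data frame that flags each input and suggests a correction where one is known. The input symbols are illustrative only, and the suggested corrections depend on the date of the bundled map.

```r
library(HGNChelper)

# Illustrative inputs: a valid symbol, an Excel-mangled symbol, and an unknown one
symbols <- c("TP53", "7-Sep", "UNKNOWNGENE")

# Returns a data frame with columns x, Approved, and Suggested.Symbol
res <- checkGeneSymbols(symbols)
print(res)

# Mouse symbols are checked against the MGI map instead of the HGNC map
mouse.res <- checkGeneSymbols(c("Trp53", "1-Mar"), species = "mouse")
print(mouse.res)
```

Rows with Approved equal to FALSE are the ones to review. A suggested symbol of NA means no correction was found in the map.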



To update the symbol maps for human and mouse yourself, download this repository and run:


from its root directory. Note that this script uses the "roxygen2" R library to update the documentation.

Alternatively, you can use updated maps without updating the package; see ?getCurrentMaps.
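
A hedged sketch of that alternative, assuming an internet connection and that checkGeneSymbols() accepts the downloaded table through its map argument (as in recent releases; check ?checkGeneSymbols for your installed version):

```r
library(HGNChelper)

# Downloads the current HGNC table; this needs network access and can take a while
current.map <- getCurrentHumanMap()

# Check symbols against the freshly downloaded map instead of the bundled one
res <- checkGeneSymbols(c("TP53", "7-Sep"), map = current.map)
print(res)
```

getCurrentMouseMap() can be used the same way together with species = "mouse".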

Updating gh-pages

Note to self - when updating the vignette, update the gh-pages website like this:

# pip install ghp-import
Rscript -e "devtools::build_vignettes()"
ghp-import inst/doc
git push



Changes in version 0.7.0

  • Use new download URL
  • This includes non-coding RNA, "phenotype", pseudogene, protein-coding gene, and "other"

Changes in version 0.6.0

  • Add vignette

Changes in version 0.5.0

  • Add getCurrentHumanMap and getCurrentMouseMap

Changes in version 0.4.2

  • Support for checking mouse symbols added. See ?checkGeneSymbols and ?mouse.table

Changes in version 0.3.2

  • update map to Jan 17, 2016
  • Fixed issue #4

Changes in version 0.3.1

  • update map to Dec 3, 2014
  • fixed corner case of orf symbols with incorrect capitalization, which were being identified as invalid but not corrected. For example, c21orf62-as1 now gets corrected to C21orf62-AS1; see tests/checkGeneSymbols.R.
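
The capitalization rule behind this fix can be illustrated in base R. This is only a sketch of the idea, not the package's implementation: fix_orf_case is a hypothetical helper, and the C<chromosome>orf<number> pattern is an assumption about what an orf symbol looks like.

```r
# Hypothetical helper, not part of HGNChelper: upper-case everything,
# then restore the lower-case "orf" in symbols such as C21orf62
fix_orf_case <- function(x) {
  up <- toupper(x)
  sub("^C([0-9XY]+)ORF([0-9]+)", "C\\1orf\\2", up)
}

fixed <- fix_orf_case(c("c21orf62-as1", "tp53", "C19ORF71"))
print(fixed)
```

Symbols that do not match the orf pattern are simply upper-cased, so the antisense suffix in c21orf62-as1 becomes AS1 while orf stays lower-case.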

Changes in version 0.3.0

  • add toupper() to hgnc.table[, 1]
  • add additional Excel date formats to inst/extdata/mog_map.csv
  • additional unit tests in man/checkGeneSymbols.Rd

Changes in version 0.2.6

  • fixed checkGeneSymbols(x) when x contains NAs.

Changes in version 0.2.5

  • update inst/hgncLookup.R to new webpage.
  • added hgnc.table argument to checkGeneSymbols() to allow optional specification of a more up-to-date map from
  • update to 2014/02/09 HGNC data.
  • move vignette to new location
  • move GEO GPL analysis to inst/analyses; not included in build.

Changes in version 0.2.2

  • change license to GPL >= 2.0 (from GPL > 2.0)
  • convert all putative gene symbols to upper-case, except for orf. This catches symbols that are non-standard because of lower-case letters.
  • added inst/extdata/genenames_org.csv for full transparency

Changes in version 0.2.1

  • changes to checkGeneSymbols:
    • Added option
    • set stringsAsFactors=FALSE so output columns are character class instead of factor
    • use return(df) instead of invisible(df).

Changes in version 0.2

  • Markus added checkGeneSymbols and unit tests.

Changes in version 0.1

  • Initial version uploaded to CRAN


0.8.1 by Levi Waldron, 2 years ago

Authors: Levi Waldron and Markus Riester

GPL (>= 2.0) license

Depends on methods, utils

Suggests testthat, knitr, rmarkdown

Imported by MetaIntegrator.

Suggested by DGCA.
