A wrapper for the homologene database by the National Center for Biotechnology Information ('NCBI'). It allows searching for gene homologs across species. Data in this package can be found at <ftp://ftp.ncbi.nih.gov/pub/HomoloGene/build68/>. The package also includes an updated version of the homologene database, in which gene identifiers and symbols are replaced with their latest versions (as of the time of submission), and functions that fetch the latest annotation data to keep the database up to date.
An R package that works as a wrapper to homologene
Available species are
More species can be added on request.
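The species table itself is not reproduced here. As a minimal sketch, assuming the package exposes its supported taxa as a data frame named `taxData` (check the package documentation if the name differs in your version), the list can be printed from R:

```r
library(homologene)

# Assumption: taxData is the data frame of supported species shipped
# with the package, with one row per taxon.
species <- homologene::taxData
print(species)
```

The taxon ids in this table are the values expected by the inTax and outTax arguments.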
The basic homologene function requires a list of gene symbols or NCBI ids, an inTax, and an outTax. In this example, inTax is the taxon id of Mus musculus, while outTax is the taxon id of humans.
homologene(c('Eno2','Mog'), inTax = 10090, outTax = 9606)
##   10090 9606 10090_ID 9606_ID
## 1  Eno2 ENO2    13807    2026
## 2   Mog  MOG    17441    4340
homologene(c('Eno2','17441'), inTax = 10090, outTax = 9606)
##   10090 9606 10090_ID 9606_ID
## 1  Eno2 ENO2    13807    2026
## 2   Mog  MOG    17441    4340
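The package description mentions an updated copy of the database with refreshed identifiers and symbols. A hedged sketch of querying it follows; it assumes the updated table is exported as `homologeneData2` and that `homologene()` accepts it through a `db` argument. Both names are assumptions to verify against the installed version.

```r
library(homologene)

# Assumption: homologeneData2 is the updated database shipped with the
# package, and `db` selects which database homologene() searches.
updated <- homologene(c('Eno2', 'Mog'),
                      inTax = 10090,
                      outTax = 9606,
                      db = homologeneData2)
print(updated)
```

Results may differ from the default database wherever a symbol or id has changed since build 68.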
For mouse and human, two convenience functions exist that remove the need to provide taxonomic identifiers. Note that the column names are not the same as those in the homologene output.
##   mouseGene humanGene mouseID humanID
## 1      Eno2      ENO2   13807    2026
## 2       Mog       MOG   17441    4340
##   humanGene mouseGene humanID mouseID
## 1      ENO2      Eno2    2026   13807
## 2       MOG       Mog    4340   17441
## 3      GZMH      Gzmd    2999   14941
## 4      GZMH      Gzme    2999   14942
## 5      GZMH      Gzmg    2999   14944
## 6      GZMH      Gzmf    2999   14943
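The calls that produced the two tables above are not shown. As a sketch, assuming the convenience functions are the package's `mouse2human` and `human2mouse`, and assuming input vectors inferred from the gene symbols in the output, they would look roughly like this:

```r
library(homologene)

# Mouse symbols in, human homologs out; no taxon ids needed.
m2h <- mouse2human(c('Eno2', 'Mog'))
print(m2h)

# Human symbols in, mouse homologs out. The input vector is an
# assumption; a human gene with several mouse homologs yields one
# row per homolog.
h2m <- human2mouse(c('ENO2', 'MOG', 'GZMH'))
print(h2m)
```

Because a single input gene can map to several homologs, the result can have more rows than the input has genes.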
As of version 1.1.68, the output includes NCBI ids. Since this doesn't change any of the existing column names or their order, it shouldn't cause problems in most use cases. If this is an issue for you, please notify me.
If you can't find a gene you are looking for, it may be listed under a synonym. See the geneSynonym package to find synonyms. If you have other problems, open an issue or send an email.