Taxonomic Information from 'Wikipedia'

'Taxonomic' information from 'Wikipedia', 'Wikicommons', 'Wikispecies', and 'Wikidata'. Functions included for getting taxonomic information from each of the sources just listed, as well as performing taxonomic search.


Project Status: Active. The project has reached a stable, usable state and is being actively developed.

wikitaxa - taxonomy data from Wikipedia/Wikidata/Wikispecies

Get started below and with the vignette: https://cran.r-project.org/package=wikitaxa

See also the taxize book: https://ropensci.github.io/taxize-book/

Low level API

The low level API is meant for power users and gives you more control, but requires more knowledge.

  • wt_wiki_page()
  • wt_wiki_page_parse()
  • wt_wiki_url_build()
  • wt_wiki_url_parse()
  • wt_wikispecies_parse()
  • wt_wikicommons_parse()
  • wt_wikipedia_parse()

High level API

The high level API is meant to be easier and faster to use.

  • wt_data()
  • wt_data_id()
  • wt_wikispecies()
  • wt_wikicommons()
  • wt_wikipedia()

Search functions:

  • wt_wikicommons_search()
  • wt_wikispecies_search()
  • wt_wikipedia_search()

Installation

CRAN version

install.packages("wikitaxa")

Dev version

install.packages("devtools")
devtools::install_github("ropensci/wikitaxa")
library('wikitaxa')
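
If a script should run whether or not wikitaxa is already installed, a guard like the one below can help. This is a minimal sketch using only base R; the helper name has_pkg is hypothetical and not part of wikitaxa.

```r
# Hypothetical helper (not part of wikitaxa): TRUE if a package is installed.
has_pkg <- function(pkg) {
  isTRUE(requireNamespace(pkg, quietly = TRUE))
}

# Install from CRAN only when the package is missing. The interactive()
# check keeps non-interactive runs from installing anything.
if (!has_pkg("wikitaxa") && interactive()) {
  install.packages("wikitaxa")
}

# Attach the package only if it is actually available.
if (has_pkg("wikitaxa")) {
  library("wikitaxa")
}
```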

wikidata

wt_data("Poa annua")

Get a Wikidata ID

wt_data_id("Mimulus foliatus")
#> [1] "Q6495130"
#> attr(,"class")
#> [1] "wiki_id"
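
The printed result is a character string carrying the class attribute "wiki_id". The sketch below rebuilds such an object by hand, so it runs without network access, and turns it into a Wikidata page URL following the https://www.wikidata.org/wiki/<ID> pattern seen in the iwlinks output of the wikipedia section. The helper wikidata_url is hypothetical, not a wikitaxa function.

```r
# Recreate the object shown above by hand; in practice it would come from
# wt_data_id("Mimulus foliatus").
id <- structure("Q6495130", class = "wiki_id")

# Hypothetical helper: build the Wikidata page URL for an ID.
# unclass() drops the "wiki_id" class so a plain string is pasted.
wikidata_url <- function(x) {
  paste0("https://www.wikidata.org/wiki/", unclass(x))
}

wikidata_url(id)
#> [1] "https://www.wikidata.org/wiki/Q6495130"
```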

wikipedia

lower level

pg <- wt_wiki_page("https://en.wikipedia.org/wiki/Malus_domestica")
res <- wt_wiki_page_parse(pg)
res$iwlinks
#> [1] "https://commons.wikimedia.org/wiki/Category:apples"         
#> [2] "https://commons.wikimedia.org/wiki/Category:Apple_cultivars"
#> [3] "https://www.wikidata.org/wiki/Q158657"                      
#> [4] "https://www.wikidata.org/wiki/Q18674606"                    
#> [5] "https://species.wikimedia.org/wiki/Malus_pumila"            
#> [6] "https://species.wikimedia.org/wiki/Malus_domestica"

higher level

res <- wt_wikipedia("Malus domestica")
res$common_names
#> # A tibble: 1 x 2
#>   name  language
#>   <chr> <chr>   
#> 1 Apple en
res$classification
#> # A tibble: 3 x 2
#>   rank       name        
#>   <chr>      <chr>       
#> 1 plainlinks ""          
#> 2 species    M. pumila   
#> 3 binomial   Malus pumila

choose a wikipedia language

# French
wt_wikipedia(name = "Malus domestica", wiki = "fr")
# Slovak
wt_wikipedia(name = "Malus domestica", wiki = "sk")
# Vietnamese
wt_wikipedia(name = "Malus domestica", wiki = "vi")
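
To query several language editions in one go, the three calls above can be folded into a loop. The sketch below is one possible way to do that: fetch_by_language is a hypothetical helper (not part of wikitaxa) that takes the fetching function as an argument, so it can be tried offline with a stand-in and then pointed at wt_wikipedia for real requests. Failed lookups are turned into NULL here, which is a design choice of this sketch rather than documented package behaviour.

```r
# Hypothetical helper: call `fetch` once per language code and return a
# list named by language. `fetch` must accept `name` and `wiki` arguments,
# as wt_wikipedia() does in the calls above.
fetch_by_language <- function(name, wikis, fetch) {
  out <- lapply(wikis, function(w) {
    # Turn an error for one language into NULL instead of aborting the loop.
    tryCatch(fetch(name = name, wiki = w), error = function(e) NULL)
  })
  names(out) <- wikis
  out
}

# Real use (needs network access and wikitaxa installed):
# res <- fetch_by_language("Malus domestica", c("fr", "sk", "vi"),
#                          wikitaxa::wt_wikipedia)

# Offline demonstration with a stand-in fetcher.
fake_fetch <- function(name, wiki) paste(wiki, name, sep = ":")
demo <- fetch_by_language("Malus domestica", c("fr", "sk", "vi"), fake_fetch)
names(demo)
#> [1] "fr" "sk" "vi"
```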

wikicommons

lower level

pg <- wt_wiki_page("https://commons.wikimedia.org/wiki/Abelmoschus")
res <- wt_wikicommons_parse(pg)
res$common_names[1:3]
#> [[1]]
#> [[1]]$name
#> [1] "okra"
#> 
#> [[1]]$language
#> [1] "en"
#> 
#> 
#> [[2]]
#> [[2]]$name
#> [1] "مسكي"
#> 
#> [[2]]$language
#> [1] "ar"
#> 
#> 
#> [[3]]
#> [[3]]$name
#> [1] "Abelmoş"
#> 
#> [[3]]$language
#> [1] "az"
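
The low level parser returns common names as a list of name/language pairs, whereas the higher level function returns a two-column table. The sketch below shows one way to flatten such a list into a data frame with base R; the input is copied by hand from the output above so the example runs offline.

```r
# First three entries of res$common_names, copied from the output above.
common_names <- list(
  list(name = "okra", language = "en"),
  list(name = "مسكي", language = "ar"),
  list(name = "Abelmoş", language = "az")
)

# Turn each entry into a one-row data frame, then bind the rows together.
common_names_df <- do.call(
  rbind,
  lapply(common_names, function(x) {
    data.frame(name = x$name, language = x$language, stringsAsFactors = FALSE)
  })
)

common_names_df$language
#> [1] "en" "ar" "az"
```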

higher level

res <- wt_wikicommons("Abelmoschus")
res$classification
#> # A tibble: 15 x 2
#>    rank       name            
#>    <chr>      <chr>           
#>  1 Domain     Eukaryota       
#>  2 unranked   Archaeplastida  
#>  3 Regnum     Plantae         
#>  4 Cladus     angiosperms     
#>  5 Cladus     eudicots        
#>  6 Cladus     core eudicots   
#>  7 Cladus     superrosids     
#>  8 Cladus     rosids          
#>  9 Cladus     eurosids II     
#> 10 Ordo       Malvales        
#> 11 Familia    Malvaceae       
#> 12 Subfamilia Malvoideae      
#> 13 Tribus     Hibisceae       
#> 14 Genus      Abelmoschus     
#> 15 Authority  " Medik. (1787)"
res$common_names
#> # A tibble: 19 x 2
#>    name             language
#>    <chr>            <chr>   
#>  1 okra             en      
#>  2 مسكي             ar      
#>  3 Abelmoş          az      
#>  4 Ibiškovec        cs      
#>  5 Bisameibisch     de      
#>  6 Okrat            fi      
#>  7 Abelmosco        gl      
#>  8 Abelmošus        hr      
#>  9 Ybiškė           lt      
#> 10 അബെൽമോസ്കസ്        ml      
#> 11 Абельмош         mrj     
#> 12 Abelmoskusslekta nn      
#> 13 Piżmian          pl      
#> 14 Абельмош         ru      
#> 15 موري             sd      
#> 16 Okrasläktet      sv      
#> 17 Абельмош         udm     
#> 18 Chi Vông vang    vi      
#> 19 黄葵属           zh
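
The classification table uses Latin rank labels (Regnum, Ordo, Familia), can repeat a rank (Cladus), and may carry stray whitespace, as in the Authority row. The sketch below looks up names by rank on a hand-copied subset of the rows above; names_at_rank is a hypothetical helper, and a plain data frame stands in for the tibble.

```r
# A subset of res$classification, copied from the output above.
classification <- data.frame(
  rank = c("Regnum", "Cladus", "Cladus", "Ordo", "Familia", "Genus", "Authority"),
  name = c("Plantae", "angiosperms", "eudicots", "Malvales", "Malvaceae",
           "Abelmoschus", " Medik. (1787)"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: all names recorded at a given rank, with
# surrounding whitespace trimmed.
names_at_rank <- function(classification, rank) {
  trimws(classification$name[classification$rank == rank])
}

names_at_rank(classification, "Familia")
#> [1] "Malvaceae"
names_at_rank(classification, "Cladus")
#> [1] "angiosperms" "eudicots"
names_at_rank(classification, "Authority")
#> [1] "Medik. (1787)"
```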

wikispecies

lower level

pg <- wt_wiki_page("https://species.wikimedia.org/wiki/Malus_domestica")
res <- wt_wikispecies_parse(pg, types = "common_names")
res$common_names[1:3]
#> [[1]]
#> [[1]]$name
#> [1] "Ябълка"
#> 
#> [[1]]$language
#> [1] "български"
#> 
#> 
#> [[2]]
#> [[2]]$name
#> [1] "Poma, pomera"
#> 
#> [[2]]$language
#> [1] "català"
#> 
#> 
#> [[3]]
#> [[3]]$name
#> [1] "jabloň domácí"
#> 
#> [[3]]$language
#> [1] "čeština"

higher level

res <- wt_wikispecies("Malus domestica")
res$classification
#> # A tibble: 8 x 2
#>   rank        name         
#>   <chr>       <chr>        
#> 1 Superregnum Eukaryota    
#> 2 Regnum      Plantae      
#> 3 Cladus      Angiosperms  
#> 4 Cladus      Eudicots     
#> 5 Cladus      Core eudicots
#> 6 Cladus      Rosids       
#> 7 Cladus      Eurosids I   
#> 8 Ordo        Rosales
res$common_names
#> # A tibble: 22 x 2
#>    name          language 
#>    <chr>         <chr>    
#>  1 Ябълка        български
#>  2 Poma, pomera  català   
#>  3 jabloň domácí čeština  
#>  4 Apfel         Deutsch  
#>  5 Aed-õunapuu   eesti    
#>  6 Μηλιά         Ελληνικά 
#>  7 Apple         English  
#>  8 Manzano       español  
#>  9 Pomme         français 
#> 10 Melâr         furlan   
#> # ... with 12 more rows
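
Note that the language column here holds language names written in the language itself (English, Deutsch), not the two-letter codes seen in the wikicommons output. A lookup therefore has to match on the full name. The sketch below shows this on a few hand-copied rows; name_in is a hypothetical helper, not a wikitaxa function.

```r
# A few rows of res$common_names, copied from the output above.
common_names <- data.frame(
  name = c("Apfel", "Apple", "Manzano", "Pomme"),
  language = c("Deutsch", "English", "español", "français"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: common names recorded for one language label.
name_in <- function(common_names, language) {
  common_names$name[common_names$language == language]
}

name_in(common_names, "English")
#> [1] "Apple"

# A two-letter code finds nothing, because Wikispecies labels are full names.
name_in(common_names, "en")
#> character(0)
```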

Contributors

Meta

  • Please report any issues or bugs at https://github.com/ropensci/wikitaxa/issues
  • License: MIT
  • Get citation information for wikitaxa in R by running citation(package = 'wikitaxa')
  • Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.


News

wikitaxa 0.3.0

MINOR IMPROVEMENTS

  • integration with vcr for test caching for all HTTP requests (#17) (#18)
  • link to taxize book and wikitaxa vignette in readme (#16)

BUG FIXES

  • fix to wt_wikipedia() to separate <br> tags appropriately (#15)

wikitaxa 0.2.0

BUG FIXES

  • wt_wikicommons() fails better now when a page does not exist, and is now consistent with the rest of the package (#14)
  • wt_wikicommons() fixed - classification objects were not working correctly as the data used is a hot mess - tried to improve parsing of that text (#13)
  • wt_data() fix - was failing due to, I think, a change in the WikidataR package, which is used internally (#12)

wikitaxa 0.1.4

NEW FEATURES

  • wt_wikipedia() and wt_wikipedia_search() gain parameter wiki to give the wiki language, which defaults to en (#9)

MINOR IMPROVEMENTS

  • move some examples to dontrun (#11)

wikitaxa 0.1.0

NEW FEATURES

  • Released to CRAN


0.3.0 by Scott Chamberlain, 10 months ago


https://github.com/ropensci/wikitaxa


Report a bug at https://github.com/ropensci/wikitaxa/issues


Browse source code at https://github.com/cran/wikitaxa


Authors: Scott Chamberlain [aut, cre], Ethan Welty [aut]




MIT + file LICENSE license


Imports WikidataR, data.table, curl, crul, tibble, jsonlite, xml2

Suggests roxygen2, testthat, knitr, rmarkdown, vcr


Imported by taxize, taxotools.

