A testing workbench for evaluating precision-recall curves under various conditions.
The aim of the prcbench package is to provide a testing workbench for evaluating precision-recall curves under various conditions. It contains integrated interfaces for the following five tools, as well as predefined test data sets.
Tool | Link |
---|---|
ROCR | Tool web site, CRAN |
AUCCalculator | Tool web site |
PerfMeas | CRAN |
PRROC | CRAN |
precrec | Tool web site, CRAN |
AUCCalculator requires a Java runtime (>= 6).
PerfMeas requires Bioconductor libraries. To install these dependencies automatically, add a Bioconductor repository to the repository list as follows:
## Include a Bioconductor repository
setRepositories(ind = 1:2)
Install the release version of prcbench from CRAN with install.packages("prcbench").
Alternatively, you can install a development version of prcbench from our GitHub repository. To install it:
1. Make sure you have a working development environment.
2. Install devtools from CRAN with install.packages("devtools").
3. Install prcbench from the GitHub repository with devtools::install_github("takayasaito/prcbench").
You can manually install the dependencies from Bioconductor if install.packages fails to access the Bioconductor repository.
## try http:// if https:// URLs are not supported
source("https://bioconductor.org/biocLite.R")
biocLite("limma")
biocLite("graph")
biocLite("RBGL")
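Recent Bioconductor releases replaced the biocLite script with the BiocManager package, so the commands above may not work on a current R installation. The following is a minimal sketch of the same manual installation using BiocManager. The helper name install_bioc_deps is hypothetical and not part of prcbench; it only wraps BiocManager::install so that packages already present are skipped.

```r
## Bioconductor packages named in the instructions above
bioc_deps <- c("limma", "graph", "RBGL")

## Hypothetical helper (not part of prcbench): install only the
## dependencies that are not yet available on this machine.
install_bioc_deps <- function(pkgs = bioc_deps) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- pkgs[!installed]
  if (length(missing_pkgs) == 0) {
    return(invisible(character(0)))
  }
  ## BiocManager itself is distributed on CRAN
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(missing_pkgs)
  invisible(missing_pkgs)
}

## Run manually when needed (requires network access):
## install_bioc_deps()
```

The call is left commented out so that sourcing the snippet does not trigger an installation by accident.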
Some OSs require further configuration for rJava.
Use:
Sys.setenv(JAVA_HOME = "<path to JRE>")
or
#!/bin/bash
export JAVA_HOME="<path to JRE>"
R CMD javareconf
microbenchmark does not work on some OSs. prcbench uses system.time when microbenchmark is not available.
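The fallback described above can be sketched as follows. This is not the code prcbench uses internally; time_expr is a hypothetical helper that illustrates the general pattern of preferring microbenchmark and falling back to system.time when the package is not installed.

```r
## Hypothetical helper (not part of prcbench): time a function with
## microbenchmark when it is installed, otherwise with system.time.
time_expr <- function(fun, times = 5L) {
  if (requireNamespace("microbenchmark", quietly = TRUE)) {
    res <- microbenchmark::microbenchmark(fun(), times = times)
    ## microbenchmark stores raw timings in nanoseconds
    elapsed_ms <- res$time / 1e6
  } else {
    ## system.time reports elapsed time in seconds
    elapsed_ms <- vapply(seq_len(times), function(i) {
      system.time(fun())[["elapsed"]] * 1000
    }, numeric(1))
  }
  c(min = min(elapsed_ms),
    median = stats::median(elapsed_ms),
    max = max(elapsed_ms))
}

## Example: time sorting of 10,000 random numbers
timing <- time_expr(function() sum(sort(runif(1e4))))
print(timing)
```

Note that system.time has a coarser resolution than microbenchmark, so very fast calls may be reported as zero in the fallback branch.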
Introduction to prcbench - a package vignette that describes the functions with several useful examples. View the vignette with vignette("introduction", package = "prcbench") in R. The HTML version is also available on GitHub Pages.
Help pages - all the functions, including the S3 generics, have their own help pages with plenty of examples. View the main help page with help(package = "prcbench") in R. The HTML version is also available on GitHub Pages.
The following two examples show the basic usage of prcbench functions.
The run_benchmark function outputs the result of microbenchmark for specified tools.
## Load library
library(prcbench)

## Run microbenchmark for auc5 on b10
testset <- create_testset("bench", "b10")
toolset <- create_toolset(set_names = "auc5")
res <- run_benchmark(testset, toolset)

## Use knitr::kable to show the result in a table format
knitr::kable(res$tab, digits = 2)
testset | toolset | toolname | min | lq | mean | median | uq | max | neval |
---|---|---|---|---|---|---|---|---|---|
b10 | auc5 | ROCR | 2.59 | 2.81 | 80.21 | 2.91 | 168.70 | 224.05 | 5 |
b10 | auc5 | AUCCalculator | 5.02 | 5.07 | 21.70 | 5.13 | 35.56 | 57.71 | 5 |
b10 | auc5 | PerfMeas | 0.12 | 0.13 | 154.44 | 0.19 | 29.47 | 742.31 | 5 |
b10 | auc5 | PRROC | 0.27 | 0.27 | 49.32 | 0.30 | 48.54 | 197.23 | 5 |
b10 | auc5 | precrec | 7.67 | 7.71 | 180.34 | 7.90 | 216.46 | 661.93 | 5 |
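In the table above, the mean is far larger than the median for every tool, which suggests that one of the five evaluations was much slower than the rest. The sketch below re-enters the mean, median, and max columns by hand (it does not call prcbench) and ranks the tools by median, which is less sensitive to such outliers. The column name mean_to_median is introduced here for illustration only.

```r
## Values copied from the benchmark table above
tab <- data.frame(
  toolname = c("ROCR", "AUCCalculator", "PerfMeas", "PRROC", "precrec"),
  mean     = c(80.21, 21.70, 154.44, 49.32, 180.34),
  median   = c(2.91, 5.13, 0.19, 0.30, 7.90),
  max      = c(224.05, 57.71, 742.31, 197.23, 661.93)
)

## Rank tools by median time (fastest first)
ranked <- tab[order(tab$median), ]

## Illustrative column: how strongly the mean is inflated by slow runs
ranked$mean_to_median <- round(ranked$mean / ranked$median, 1)

print(ranked, row.names = FALSE)
```

With a live result, the same ordering could be applied to res$tab directly, assuming it keeps the column names shown in the table.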
The run_evalcurve function evaluates precision-recall curves with predefined test datasets. The autoplot function plots the result of the run_evalcurve function.
## ggplot2 is necessary to use autoplot
library(ggplot2)

## Plot base points and the result of precrec on c1, c2, and c3 test sets
testset <- create_testset("curve", c("c1", "c2", "c3"))
toolset <- create_toolset("precrec")
scores1 <- run_evalcurve(testset, toolset)
autoplot(scores1)
## Plot the results of PerfMeas and PRROC on c1, c2, and c3 test sets
toolset <- create_toolset(c("PerfMeas", "PRROC"))
scores2 <- run_evalcurve(testset, toolset)
autoplot(scores2, base_plot = FALSE)
Precrec: fast and accurate precision-recall and ROC curve calculations in R
Takaya Saito; Marc Rehmsmeier
Bioinformatics 2017; 33 (1): 145-147.
doi: 10.1093/bioinformatics/btw570
Classifier evaluation with imbalanced datasets - our web site that contains several pages with useful tips for performance evaluation on binary classifiers.
The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets - our paper that summarized potential pitfalls of ROC plots with imbalanced datasets and advantages of using precision-recall plots instead.
Update README
Update wrapper functions so that precrec works when PerfMeas is not available
Change predefined C3 data
Update AppVeyor config for rJava
Enhance create_usrtool
Add a new test set
Add test categories to curve evaluation test result
Improve graph options
Improve the testing environment
Change Java version
Fix microbenchmark
Improve several documents
The first release version of prcbench
The package offers four main functions
The package contains predefined interfaces of the following five tools