Classification performance metrics that are derived from the ROC curve of a classifier. The package includes the H-measure performance metric as described in < http://link.springer.com/article/10.1007/s10994-009-5119-5>, which computes the minimum total misclassification cost, integrating over any uncertainty about the relative misclassification costs, as per a user-defined prior. It also offers a one-stop-shop for other scalar metrics of performance, including sensitivity, specificity and many others, and also offers plotting tools for ROC curves and related statistics.
hmeasure package implements a large number of classification performance metrics, such as the AUC, Error Rate, sensitivity and specificity, and additionally extends standard libraries by incorporating recent advances in this area, notably including the H-measure which was proposed by David Hand as coherent alternative to the AUC, and further developed by Hand and Anagnostopoulos. For more info, please visit hmeasure.net.
This package aspires to become a one-stop-shop for classification performance metrics, and to support implementations in all popular data science languages. At the moment it is already covering more metrics than any other package in R, and is the only one that covers the H-measure, Specificity at fixed Sensitivity and Minimum Error Rate.
hmeasure packages relies on the scoring interface that most classifiers support, whereby each example is assigned a score, so that higher values indicate that the example is more likely to be labelled positive. For many classifiers, this scores is intended to be a probability and lies between 0 and 1, but that is not a requirement for the package. The main function,
HMeasure, takes as input the true labels, and a data frame where each column contains the scores of a classifier applied on a certain dataset. You may run the following example where we directly specify the scores to be a noisy version of the label, with more noise for classifier A than classifier B (this is known as a the 'binormal' case in the literature):
library(hmeasure) n = 50 set.seed(1) y = c(rep(1, n), rep(0, n)) scores = data.frame( A=c(rnorm(n,0,1), rnorm(n,0.5,1)), B=c(rnorm(n,0,1), rnorm(n,2,1)) ) out = HMeasure(true.class = y, scores = scores) summary(out)
The package also supports a plotting routine that visualizes the ROC curve, as well as some additional interesting plots that concern specifically the H-measure:
The package is available on CRAN and can be downloaded and installed as per usual:
For the most recent version, you can install the github version instead: