In many binary classification applications, such as disease
diagnosis and spam detection, practitioners commonly face the need to limit
type I error (i.e., the conditional probability of misclassifying a class 0
observation as class 1) so that it remains below a desired threshold. To address
this need, the Neyman-Pearson (NP) classification paradigm is a natural choice;
it minimizes type II error (i.e., the conditional probability of misclassifying
a class 1 observation as class 0) while enforcing an upper bound, alpha, on the
type I error. Although the NP paradigm has a century-long history in hypothesis
testing, it has not been well recognized and implemented in classification
schemes. Common practices that directly limit the empirical type I error to
no more than alpha do not satisfy the type I error control objective because
the resulting classifiers are still likely to have type I errors much larger
than alpha. As a result, the NP paradigm has not been properly implemented
for many classification scenarios in practice. In this work, we develop the
first umbrella algorithm that implements the NP paradigm for all scoring-type
classification methods, including popular methods such as logistic regression,
support vector machines and random forests. Powered by this umbrella algorithm,
we propose a novel graphical tool for NP classification methods: NP receiver
operating characteristic (NP-ROC) bands, motivated by the popular receiver
operating characteristic (ROC) curves. NP-ROC bands will help choose in a data
adaptive way and compare different NP classifiers. The paper is available at