Abstract: We describe and evaluate two algorithms for the Neyman-Pearson (NP) classification problem, which has recently been shown to be of particular importance for bipartite ranking problems. NP classification is a nonconvex problem involving a constraint on the false negative rate. We investigate a batch algorithm based on DC programming and a stochastic gradient method well suited to large-scale datasets. Empirical evidence illustrates the potential of the proposed methods.
Preprint: tist-2010.djvu tist-2010.pdf tist-2010.ps.gz
@article{gasso-2010,
  author  = {Gasso, Gilles and Pappaioannou, Aristidis and Spivak, Marina and Bottou, L\'{e}on},
  title   = {Batch and online learning algorithms for nonconvex {Neyman-Pearson} classification},
  journal = {ACM Transactions on Intelligent Systems and Technology},
  volume  = {2},
  number  = {3},
  year    = {2011},
  url     = {http://leon.bottou.org/papers/gasso-2010},
}
This paper studies classification problems where one wishes, for instance, to optimize precision for a fixed recall. It gives two algorithms that perform very well in practice and scale significantly better than the alternatives. The appendix gives a number of theoretical results about duality in nonconvex optimization problems. These results are simple but surprisingly powerful.
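To make the setting concrete, the sketch below illustrates one simple way to attack an NP-style objective with stochastic gradients: minimize errors on the negative class subject to a bound rho on the false negative rate, handled through a multiplier that is adjusted between passes. This is only an illustrative sketch under assumptions of my own (a hinge surrogate and a naive multiplier update); it is not the DC programming or online algorithm studied in the paper.

    # Hypothetical sketch (not the paper's algorithm): stochastic gradient descent
    # on a multiplier-weighted hinge surrogate of the Neyman-Pearson objective
    #   minimize false positive surrogate  s.t.  false negative rate <= rho
    import numpy as np

    def np_sgd(X, y, rho=0.1, epochs=20, lr=0.01, lr_lambda=0.1):
        """y in {-1, +1}; +1 is the class whose miss rate is constrained."""
        n, d = X.shape
        w, b, lam = np.zeros(d), 0.0, 1.0
        for _ in range(epochs):
            for i in np.random.permutation(n):
                margin = y[i] * (X[i] @ w + b)
                if margin < 1:                       # hinge surrogate is active
                    # negatives drive the objective, positives the constraint (weight lam)
                    weight = lam if y[i] > 0 else 1.0
                    w += lr * weight * y[i] * X[i]
                    b += lr * weight * y[i]
            # estimate the constrained quantity (false negative rate) and adjust lam
            pos = y > 0
            fnr = np.mean((X[pos] @ w + b) <= 0) if pos.any() else 0.0
            lam = max(0.0, lam + lr_lambda * (fnr - rho))
        return w, b, lam

The multiplier update mimics a primal-dual scheme: when the measured false negative rate exceeds the target rho, the weight on positive examples grows, pushing the decision boundary to satisfy the constraint at the expense of more false positives.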