Batch and online learning algorithms for nonconvex Neyman-Pearson classification

Abstract: We describe and evaluate two algorithms for Neyman-Pearson (NP) classification problem which has been recently shown to be of a particular importance for bipartite ranking problems. NP classification is a nonconvex problem involving a constraint on false negatives rate. We investigated batch algorithm based on DC programming and stochastic gradient method well suited for large scale datasets. Empirical evidence illustrates the potential of the proposed methods.

Gilles Gasso, Aristidis Pappaioannou, Marina Spivak and Léon Bottou: Batch and online learning algorithms for nonconvex Neyman-Pearson classification, ACM Transaction on Intelligent System and Technologies, 2(3), 2011.

Preprint: tist-2010.djvu tist-2010.pdf tist-2010.ps.gz

@article{gasso-2010,
  author = {Gasso, Gilles and Pappaioannou, Aristidis 
            and Spivak, Marina and Bottou, L\'{e}on},
  title = {Batch and online learning algorithms for nonconvex {Neyman-Pearson} classification},
  journal = {ACM Transaction on Intelligent System and Technologies},
  volume = {2},
  number = {3},
  year = {2011},
  url = {http://leon.bottou.org/papers/gasso-2010},
}

Notes

This paper studies classification problems where one wishes, for instance, to optimize precision for a fixed recall. The paper gives two algorithm that perform very well in practice and scale significantly better than the alternatives. The appendix gives a number of theoretical results about duality in nonconvex optimization problems. These results are simple but surprisingly powerful.