Breaking SVM Complexity with Cross-Training

Abstract: We propose to selectively remove examples from the training set using probabilistic estimates related to editing algorithms (Devijver and Kittler, 1982). This heuristic procedure aims to create a separable distribution of training examples with minimal impact on the position of the decision boundary. It breaks the linear dependency between the number of support vectors (SVs) and the number of training examples, and sharply reduces the complexity of SVMs during both the training and prediction stages.
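The classical editing idea the abstract refers to (Devijver and Kittler, 1982) can be illustrated with a minimal sketch: discard every training example whose label disagrees with the majority vote of its k nearest neighbours. This is only a toy illustration of Wilson-style k-NN editing, not the paper's cross-training procedure, which relies on probabilistic SVM estimates; the function name and the 1-D data below are illustrative assumptions.

```python
# Sketch of Wilson-style k-NN editing: drop examples whose label disagrees
# with the majority vote of their k nearest neighbours.  Illustrative only;
# the paper itself uses probabilistic estimates from SVMs, not plain k-NN.

def knn_edit(points, labels, k=3):
    """Return the indices of examples kept after one editing pass."""
    kept = []
    for i, (x, y) in enumerate(zip(points, labels)):
        # distances from example i to every other training example
        others = sorted(
            (abs(x - points[j]), labels[j])
            for j in range(len(points)) if j != i
        )
        votes = [lab for _, lab in others[:k]]
        # keep the example only if its neighbours mostly agree with its label
        if votes.count(y) > k // 2:
            kept.append(i)
    return kept

# Two 1-D clusters; index 3 is a class-1 point sitting inside the class-0 cluster.
points = [0.0, 0.1, 0.2, 0.05, 0.9, 1.0, 1.1]
labels = [0, 0, 0, 1, 1, 1, 1]

kept = knn_edit(points, labels, k=3)
# The mislabelled point at x=0.05 is removed; both clusters survive intact,
# leaving a separable training set, which is the effect the abstract aims for.
```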

Gökhan Bakir, Léon Bottou and Jason Weston: Breaking SVM Complexity with Cross-Training, Advances in Neural Information Processing Systems 17 (NIPS 2004), 81-88, Edited by Lawrence Saul, Yair Weiss and Léon Bottou, MIT Press, 2005.

nips-2004e.djvu nips-2004e.pdf nips-2004e.ps.gz

@incollection{bakir-bottou-weston-2005,
  author = {Bak{\i}r, G\"{o}khan and Bottou, L\'{e}on and Weston, Jason},
  title = {Breaking SVM Complexity with Cross-Training},
  booktitle = {Advances in Neural Information Processing Systems 17 (NIPS 2004)},
  editor = {Saul, Lawrence and Weiss, Yair and Bottou, L\'{e}on},
  publisher = {MIT Press},
  year = {2005},
  pages = {81--88},
  url = {http://leon.bottou.org/papers/bakir-bottou-weston-2005},
}

Notes

Much better solutions to this problem are discussed by Bordes et al. (2006) and Collobert et al. (2006).