Parallel Support Vector Machines: The Cascade SVM

Abstract: We describe an algorithm for support vector machines (SVM) that can be parallelized efficiently and scales to very large problems with hundreds of thousands of training vectors. Instead of searching the whole training set for support vectors in one optimization step, the data are split into subsets and analyzed separately with multiple SVMs. The partial results are combined and filtered again in a Cascade of SVMs until the global optimum is reached. The Cascade SVM can be spread over multiple processors with minimal communication overhead and scales super-linearly for a small number of subdivisions; even a simulation on a single processor is therefore faster than a single optimization over the whole training set. Accelerations of 5-10x are measured for problems of 100,000 training vectors on a single processor, and of more than 20x on a cluster with 16 processors.
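To make the cascade idea concrete, here is a minimal single-pass sketch in Python. It is not the authors' implementation: it assumes scikit-learn's SVC, binary labels, an RBF kernel, and illustrative parameter values, and it arranges the subset SVMs in a simple binary-tree cascade. The paper additionally feeds the final support vectors back through the cascade to test for (and guarantee) convergence to the global optimum; this sketch performs only the single pass that the abstract says already gives good results.

    # Sketch of a single pass through a binary-tree Cascade SVM.
    # Assumptions (illustrative, not from the paper): binary labels,
    # RBF kernel, every subset contains examples of both classes.
    import numpy as np
    from sklearn.svm import SVC

    def train_on(X, y, C=1.0, gamma=0.1):
        """Train one SVM and keep only its support vectors and labels."""
        clf = SVC(C=C, kernel="rbf", gamma=gamma).fit(X, y)
        return X[clf.support_], y[clf.support_]

    def cascade_svm(X, y, n_subsets=8, C=1.0, gamma=0.1, seed=0):
        # Layer 0: train independent SVMs on disjoint random subsets.
        rng = np.random.default_rng(seed)
        idx = np.array_split(rng.permutation(len(y)), n_subsets)
        layer = [train_on(X[i], y[i], C, gamma) for i in idx]
        # Later layers: merge the support vectors of two partial results
        # and retrain, filtering the data down until one SVM remains.
        while len(layer) > 1:
            merged = [train_on(np.vstack([a[0], b[0]]),
                               np.concatenate([a[1], b[1]]), C, gamma)
                      for a, b in zip(layer[0::2], layer[1::2])]
            if len(layer) % 2:  # carry an unpaired result to the next layer
                merged.append(layer[-1])
            layer = merged
        Xsv, ysv = layer[0]
        return SVC(C=C, kernel="rbf", gamma=gamma).fit(Xsv, ysv)

    # Usage on synthetic data:
    #   from sklearn.datasets import make_classification
    #   X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
    #   model = cascade_svm(X, y, n_subsets=8)

Because each subset SVM sees only a fraction of the data, its kernel matrix is far smaller than that of a single SVM over the whole set, and the layer-0 optimizations are independent and can run on separate processors; this is the source of the speedups reported in the abstract.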

Hans Peter Graf, Eric Cosatto, Léon Bottou, Igor Dourdanovic and Vladimir Vapnik: Parallel Support Vector Machines: The Cascade SVM, Advances in Neural Information Processing Systems 17 (NIPS 2004), 521-528, Edited by Lawrence Saul, Yair Weiss and Léon Bottou, MIT Press, 2005.

nips-2004c.djvu nips-2004c.pdf nips-2004c.ps.gz

@incollection{graf-cosatto-2005,
  author = {Graf, Hans Peter and Cosatto, Eric and Bottou, L\'{e}on and Dourdanovic, Igor and Vapnik, Vladimir},
  title = {Parallel Support Vector Machines: {The Cascade SVM}},
  booktitle = {Advances in Neural Information Processing Systems 17 (NIPS 2004)},
  pages = {521--528},
  editor = {Saul, Lawrence and Weiss, Yair and Bottou, L\'{e}on},
  publisher = {MIT Press},
  year = {2005},
  url = {http://leon.bottou.org/papers/graf-cosatto-2005},
}