The Tradeoffs of Large Scale Learning

Abstract: This contribution develops a theoretical framework that takes into account the effect of approximate optimization on learning algorithms. The analysis shows distinct tradeoffs for the case of small-scale and large-scale learning problems. Small-scale learning problems are subject to the usual approximation–estimation tradeoff. Large-scale learning problems are subject to a qualitatively different tradeoff involving the computational complexity of the underlying optimization algorithms in non-trivial ways.

Léon Bottou and Olivier Bousquet: The Tradeoffs of Large Scale Learning, Advances in Neural Information Processing Systems 20 (NIPS 2007), 161–168, Edited by J.C. Platt, D. Koller, Y. Singer and S. Roweis, NIPS Foundation, 2008.

nips-2007.djvu nips-2007.pdf

@inproceedings{bottou-bousquet-2008,
  author = {Bottou, L\'{e}on and Bousquet, Olivier},
  title = {The Tradeoffs of Large Scale Learning},
  booktitle = {Advances in Neural Information Processing Systems 20 (NIPS 2007)},
  pages = {161--168},
  publisher = {NIPS Foundation},
  year = {2008},
  editor = {Platt, J.C. and Koller, D. and Singer, Y. and Roweis, S.},
  url = {},
}


This paper establishes a formal distinction between small-scale and large-scale learning systems. The generalization performance of small-scale systems is entirely determined by the statistical properties of their estimation procedure. The generalization performance of large-scale systems also depends on the chosen optimization algorithm. In particular, stochastic gradient algorithms display very good generalization performance despite being very poor optimization algorithms. Here is a very clear experimental validation.
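The point about stochastic gradient methods can be illustrated with a minimal sketch (not taken from the paper): plain SGD on a one-parameter least-squares regression over synthetic data. All names, the learning-rate schedule, and the data-generating model are illustrative assumptions. Each step uses one noisy single-example gradient, so the training objective is never minimized exactly, yet the test error ends up close to the noise floor.

```python
# Minimal SGD sketch (illustrative, not the paper's experiment).
# Model: y = w*x, fit by one pass of single-example gradient steps.
import random

random.seed(0)


def make_data(n):
    """Synthetic data from y = 2*x + Gaussian noise."""
    return [(x, 2.0 * x + random.gauss(0.0, 0.1))
            for x in (random.uniform(-1.0, 1.0) for _ in range(n))]


train, test = make_data(200), make_data(200)

w = 0.0
for t, (x, y) in enumerate(train, start=1):
    # Gradient of the single-example squared loss (w*x - y)^2.
    grad = 2.0 * (w * x - y) * x
    w -= (0.5 / t ** 0.5) * grad  # decreasing step size, a common choice

test_mse = sum((w * x - y) ** 2 for x, y in test) / len(test)
print(f"w = {w:.3f}, test MSE = {test_mse:.4f}")
```

One pass over the data already brings `w` near the true slope 2.0 and the test MSE near the noise variance, even though no example's loss was driven to its exact minimum; this cheap-but-inexact behavior is what the paper's large-scale tradeoff analysis formalizes.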

papers/bottou-bousquet-2008.txt · Last modified: 2018/12/06 09:40 by leonb
