The Tradeoffs of Large-scale Learning

Summary

Pervasive and networked computers have reduced the cost of collecting and distributing large-scale datasets. Since usual machine learning algorithms demand computing times that grow faster than the volume of the data, computing time is now the bottleneck in real life applications.

The first part of this presentation clarifies the relation between the statistical efficiency, the design of learning algorithms and their computational cost. The analysis shows distinct tradeoffs for the case of small-scale and large-scale learning problems. Small-scale learning problems are subject to the usual approximation–estimation tradeoff. Large-scale learning problems are subject to a qualitatively different tradeoff involving the computational complexity of the underlying optimization algorithms in non-trivial ways. For instance, a mediocre optimization algorithms, stochastic gradient descent, is shown to perform very well on large-scale learning problems. The second part makes a detailled exploration of stochastic gradient descent learning algorithms. Simple and complex experiments are discussed. The source code and datasets are available online. The third part discusses algorithms that learn with a single pass over the data.

Papers

Léon Bottou and Olivier Bousquet: The Tradeoffs of Large Scale Learning, Advances in Neural Information Processing Systems, 20, MIT Press, Cambridge, MA, 2008.

more...

talks/largescale.txt · Last modified: 2014/12/18 21:03 by leonb
Driven by DokuWiki Recent changes RSS feed Valid CSS Valid XHTML 1.0