*Abstract*:
This contribution develops a theoretical framework that takes into account
the effect of approximate optimization on learning algorithms.
The analysis shows distinct tradeoffs for the case of small-scale and large-scale learning
problems. Small-scale learning problems are subject to the usual
approximation–estimation tradeoff. Large-scale learning problems are subject to a
qualitatively different tradeoff involving the computational complexity of the
underlying optimization algorithms in non-trivial ways. For instance,
a mediocre optimization algorithm, stochastic gradient descent, is
shown to perform very well on large-scale learning problems.
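The stochastic gradient descent algorithm mentioned above can be illustrated with a minimal sketch: update the model on one randomly drawn example at a time, rather than computing the exact gradient over the full dataset. This toy least-squares fit is an illustrative assumption, not the paper's experimental setup.

```python
import random

def sgd(data, lr=0.05, epochs=200):
    """Minimal stochastic gradient descent for 1-D least-squares:
    fit y = w * x by updating w on one example at a time."""
    w = 0.0
    for _ in range(epochs):
        random.shuffle(data)
        for x, y in data:
            grad = 2 * (w * x - y) * x  # gradient of (w*x - y)^2 w.r.t. w
            w -= lr * grad
    return w

# Noise-free samples of y = 3x; SGD recovers w close to 3.
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]
w = sgd(data)
```

Each update costs a constant amount of work regardless of dataset size, which is why such a crude optimizer can dominate in the large-scale regime the paper analyzes.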

Léon Bottou and Olivier Bousquet: **Learning Using Large Datasets**, *Mining Massive DataSets for Security*, NATO ASI Workshop Series, IOS Press, Amsterdam, 2008.

@incollection{bottou-bousquet-2008b, author = {Bottou, L\'{e}on and Bousquet, Olivier}, title = {Learning Using Large Datasets}, booktitle = {{Mining Massive DataSets for Security}}, publisher = {{IOS} {Press}}, address = {Amsterdam}, year = {2008}, series = {{NATO} {ASI} Workshop Series}, note = {to appear}, url = {http://leon.bottou.org/papers/bottou-bousquet-2008b}, }

Slightly expanded version of the original NIPS paper, reporting some experimental results from the SGD pages.