Differences

This shows you the differences between two versions of the page.

--- papers:bottou-lecun-2004 [2006/04/12 19:14]
127.0.0.1 (old revision restored)
+++ papers:bottou-lecun-2004 [2018/12/06 10:00] (current)
leonb
@@ Line 1: / Line 1: @@
 ===== Large Scale Online Learning =====
+//Abstract//:
+We consider situations where training data is abundant and computing
+resources are comparatively scarce. We argue that suitably designed
+online learning algorithms asymptotically outperform any batch
+learning algorithm. Both theoretical and experimental evidences are
+presented.
 <box 99% orange>
-Léon Bottou and Yann LeCun: Large Scale Online Learning,  //Advances in Neural Information Processing Systems 16//, Edited by Sebastian Thrun, Lawrence Saul and Bernhard Schölkopf, MIT Press, Cambridge, MA, 2004.
+Léon Bottou and Yann LeCun: **Large Scale Online Learning**,  //Advances in Neural Information Processing Systems 16
+(NIPS 2003)//, Edited by Sebastian Thrun, Lawrence Saul and Bernhard Schölkopf, MIT Press, Cambridge, MA, 2004.
 [[http://leon.bottou.org/publications/djvu/nips-2003.djvu|nips-2003.djvu]]
@@ Line 11: / Line 18: @@
 </box>
-  @inproceedings{bottou-lecun-2004,
+  @incollection{bottou-lecun-2004,
     author = {Bottou, L\'{e}on and {LeCun}, Yann},
     title = {Large Scale Online Learning},
-    booktitle = {Advances in Neural Information Processing Systems 16},
+    booktitle = {Advances in Neural Information Processing Systems 16 (NIPS 2003)},
     editor = {Thrun, Sebastian and Saul, Lawrence and Bernhard {Sch\"{o}lkopf}},
     publisher = {MIT Press},
@@ Line 21: / Line 28: @@
     url = {http://leon.bottou.org/papers/bottou-lecun-2004},
   }
+==== Notes ====
+The ASMB version of this work [[bottou-lecun-2004a|(Bottou and LeCun, 2005)]] contains the complete proof. The NIPS version was written several months after the ASMB. It contains only a proof sketch but reports experimental results. It also offers a better discussion of the previous results obtained by Murata and Amari (1998) which are in fact much more general than I initially realized. Relative to these results, this work contains three contributions: (a) spelling out the computational consequences of the efficiency results, (b) expressing the solution of the batch learning problems as a stochastic process that is fundamentally similar to a second order stochastic gradient algorithm, and (<html></html>c) providing a more rigorous proof, taking into account, for instance, how the gain matrix approximates the inverse Hessian.
