papers:bottou-lecun-2004 [2018/12/06 10:00] (current) leonb
  
<box 99% orange>
Léon Bottou and Yann LeCun: **Large Scale Online Learning**, //Advances in Neural Information Processing Systems 16 (NIPS 2003)//, Edited by Sebastian Thrun, Lawrence Saul and Bernhard Schölkopf, MIT Press, Cambridge, MA, 2004.

[[http://leon.bottou.org/publications/djvu/nips-2003.djvu|nips-2003.djvu]]
</box>
  
  @incollection{bottou-lecun-2004,
    author = {Bottou, L\'{e}on and {LeCun}, Yann},
    title = {Large Scale Online Learning},
    booktitle = {Advances in Neural Information Processing Systems 16 (NIPS 2003)},
    editor = {Thrun, Sebastian and Saul, Lawrence and {Sch\"{o}lkopf}, Bernhard},
    publisher = {MIT Press},
 ==== Notes ====
  
The ASMB version of this work [[bottou-lecun-2004a|(Bottou and LeCun, 2005)]] contains the complete proofs. The NIPS version was written several months after the ASMB version. It contains only a proof sketch but reports experimental results. It also offers a better discussion of the earlier results of Murata and Amari (1998), which are in fact much more general than I initially realized. Relative to these results, this work makes three contributions: (a) spelling out the computational consequences of the efficiency results, (b) expressing the solution of the batch learning problem as a stochastic process that is fundamentally similar to a second-order stochastic gradient algorithm, and (<html></html>c) providing a more rigorous proof, taking into account, for instance, how the gain matrix approximates the inverse Hessian.
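Point (b) can be illustrated concretely. The sketch below is not code from the paper; it is a minimal hypothetical example on a one-dimensional least-squares problem, where the "gain" is the inverse of the (online-estimated) Hessian scaled by 1/t. All variable names are my own.

```python
import random

# Hypothetical illustration: one-pass second-order SGD on 1-D least squares.
# The gain 1/(t*h) plays the role of (t H)^{-1}, H being the Hessian E[x^2].
random.seed(0)
w_true = 2.0
data = [(x, w_true * x + random.gauss(0, 0.1))
        for x in (random.uniform(-1, 1) for _ in range(2000))]

w = 0.0          # parameter estimate
h = 1e-3         # running estimate of the Hessian E[x^2]
for t, (x, y) in enumerate(data, start=1):
    grad = (w * x - y) * x          # gradient of 0.5*(w*x - y)^2
    h += (x * x - h) / t            # online average -> Hessian estimate
    w -= grad / (t * h)             # second-order step with gain (t*h)^{-1}

# Batch (empirical-risk) solution on the same data, for comparison
w_batch = sum(x * y for x, y in data) / sum(x * x for x, _ in data)
print(abs(w - w_batch))             # essentially zero
```

For this quadratic loss the recursion coincides with recursive least squares, so the one-pass estimate reproduces the batch solution exactly; for general losses the paper's result is the asymptotic analogue of this behavior.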