Differences

This shows you the differences between two versions of the page.

--- papers:bordes-bottou-gallinari-2009 [2009/08/07 10:16]
leonb
+++ papers:bordes-bottou-gallinari-2009 [2017/11/29 10:27] (current)
leonb [Errata]
@@ Line 12: / Line 12: @@
 PASCAL Large Scale Learning Challenge.
-//Notes//:
+<html><font color=blue></html>
-The appendix contains a derivation of upper and lower bounds
+//Errata//:
-on the asymptotic convergence speed of stochastic gradient algorithm.
+Please see section [[#Errata]] below.
-This result is exact in the case of second order stochastic gradient.
+<html></font></html>
 <box 99% orange>
-Antoine Bordes, Léon Bottou and Patrick Gallinari:  **SGD-QN: Careful Quasi-Newton Stochastic Gradient Descent**,  //Journal of Machine Learning Research//, 10:1737--1754, 2009.
+Antoine Bordes, Léon Bottou and Patrick Gallinari:  **SGD-QN: Careful Quasi-Newton Stochastic Gradient Descent**,  //Journal of Machine Learning Research//, 10:1737--1754, July 2009.
 [[http://jmlr.csail.mit.edu/papers/v10/bordes09a.html|JMLR Link]]
+[[http://jmlr.csail.mit.edu/papers/v11/bordes10a.html|JMLR Erratum]]
+<html>&nbsp;&nbsp;</html>
 [[http://leon.bottou.org/publications/djvu/jmlr-2009.djvu|jmlr-2009.djvu]]
 [[http://leon.bottou.org/publications/pdf/jmlr-2009.pdf|jmlr-2009.pdf]]
@@ Line 34: / Line 35: @@
     volume = {10},
     pages = {1737--1754},
+    month = {July},
     url = {http://leon.bottou.org/papers/bordes-bottou-gallinari-2009},
   }
+==== Implementation ====
+The complete source code of
+[[http://webia.lip6.fr/~bordes/mywiki/doku.php?id=sgdqn|LibSGDQN]]
+is available on
+[[http://webia.lip6.fr/~bordes/mywiki/doku.php|Antoine's]] web site.
+This source code comes with a script that replicates the
+experiments discussed in this paper.
+==== Appendix ====
+The appendix contains a derivation of upper and lower bounds
+on the asymptotic convergence speed of stochastic gradient algorithms.
+The constants are exact in the case of second order stochastic gradient.
+==== Errata ====
+The SGDQN algorithm as described in this paper contains a subtle flaw,
+which is explained in a subsequent [[:papers:bordes-2010|erratum]].
+In addition, the bounds of Theorem 1 as published are missing a factor 1/2. The corrected bounds read:
+\[
+ \def\w{\mathbf{w}}
+ {\frac{1}{2}} \frac{\mathrm{tr}(\mathbf{HBGB})}{2\lambda_{\max}-1}\,t^{-1} + \mathrm{o}(t^{-1})
+  ~\leq~ \mathbb{E}_{\sigma}\big[\:{\cal P}_n(\w_t)-{\cal P}_n(\w^*_n)\:\big] ~\leq~
+  {\frac{1}{2}} \frac{\mathrm{tr}(\mathbf{HBGB})}{2\lambda_{\min}-1}\,t^{-1} + \mathrm{o}(t^{-1})
+\]
+The version of the paper found on this site contains the correct theorem and proof.
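+As a quick sanity check of the corrected constants, here is a sketch of the second order case.
+It rests on two assumptions that are not spelled out on this page: that $\lambda_{\min}$ and $\lambda_{\max}$
+denote the extreme eigenvalues of $\mathbf{HB}$, and that second order stochastic gradient
+corresponds to the choice $\mathbf{B}=\mathbf{H}^{-1}$.
+Under these assumptions $\mathbf{HB}=\mathbf{I}$, so $\lambda_{\min}=\lambda_{\max}=1$ and
+$\mathrm{tr}(\mathbf{HBGB})=\mathrm{tr}(\mathbf{G}\mathbf{H}^{-1})$. Both bounds then coincide:
+\[
+  \mathbb{E}_{\sigma}\big[\:{\cal P}_n(\mathbf{w}_t)-{\cal P}_n(\mathbf{w}^*_n)\:\big]
+  ~=~ \frac{1}{2}\,\mathrm{tr}(\mathbf{G}\mathbf{H}^{-1})\,t^{-1} + \mathrm{o}(t^{-1})
+\]
+This agrees with the statement in the Appendix section that the constants are exact for second order stochastic gradient.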
