===== Curiously fast convergence of some stochastic gradient descent algorithms =====
This document reports an empirical finding about the convergence speed
of stochastic gradient algorithms applied to a finite training set and
challenges the attendees of [[https://www.ceremade.dauphine.fr/SLDS2009|SLDS 2009]]
with its theoretical implications. Stochastic gradient optimization theory
states that the residual error decreases no faster than //t^-1// when
the examples are picked randomly with replacement. The main observation is that
picking the examples without replacement (i.e. shuffling the data before each epoch and
going through it sequentially) achieves a speed curiously close to //t^-2//.
To my knowledge (as of 2014), the convergence speed for SGD sampled
without replacement has not been theoretically established.
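The two sampling schemes are easy to contrast numerically. The sketch below is a hypothetical illustration (not from the paper): it runs SGD with step sizes //1/t// on a one-dimensional least-squares finite sum, comparing independent sampling with replacement against reshuffling the data once per epoch and sweeping it sequentially.

```python
# Hypothetical illustration (not from the paper): SGD with 1/t step sizes
# on the one-dimensional least-squares finite sum
#     f(w) = (1/2n) * sum_i (w - x_i)^2,
# whose minimizer is the mean of the x_i.
import random

def sgd(xs, epochs, with_replacement, seed=0):
    rng = random.Random(seed)
    n, w, t = len(xs), 0.0, 0
    order = list(range(n))
    for _ in range(epochs):
        if with_replacement:
            # pick n independent examples per epoch
            batch = [rng.randrange(n) for _ in range(n)]
        else:
            rng.shuffle(order)   # shuffle once per epoch,
            batch = order        # then go through it sequentially
        for i in batch:
            t += 1
            w -= (1.0 / t) * (w - xs[i])  # gradient step on (w - x_i)^2 / 2
    return w

xs = [float(i) for i in range(10)]        # minimizer is mean(xs) = 4.5
err_with = abs(sgd(xs, 200, True) - 4.5)
err_without = abs(sgd(xs, 200, False) - 4.5)
print(err_with, err_without)
```

With //1/t// step sizes the iterate is exactly the running average of the examples seen so far, so each without-replacement epoch ends essentially at the optimum, while with-replacement sampling leaves the usual stochastic residual. The toy case is degenerate in the without-replacement direction, but the gap it exhibits is the phenomenon the open problem asks to explain.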
Although the open problem was not published in the
[[summa-2011|SLDS 2009 conference book]],
this finding was mentioned and replicated in the
[[http://dl.acm.org/citation.cfm?id=1966812|journal]]
[[http://www.cs.huji.ac.il/~shais/papers/ShalevSiSrCo10.pdf|version of the Pegasos paper]] (2011, section 7.5).
This open problem was also mentioned by [[http://www.eecs.berkeley.edu/~brecht|Ben Recht]] during the
[[http://www.ipam.ucla.edu/programs/sgm2014|IPAM Workshop on Stochastic Gradient Methods]] (2014)
in relation to the much more impressive speeds achieved by recent algorithms
for the optimization of finite sums ([[http://arxiv.org/abs/1309.2388|SAG]],
[[http://papers.nips.cc/paper/4937-accelerating-stochastic-gradient-descent-using-predictive-variance-reduction|SVRG]]).
New (Dec 2015): [[https://asu.mit.edu/|Asu Ozdaglar]] gave a proof during her NIPS 2015 invited talk ([[http://research.microsoft.com/apps/video/?id=259592|video recording]], [[http://arxiv.org/abs/1510.08560|arXiv]]).
\\
Léon Bottou: //**Curiously fast convergence of some stochastic gradient descent algorithms**//,
Unpublished open problem offered to the attendance of the SLDS 2009 conference, 2009.
[[http://leon.bottou.org/publications/djvu/slds-2009.djvu|slds-2009.djvu]]
[[http://leon.bottou.org/publications/pdf/slds-2009.pdf|slds-2009.pdf]]
[[http://leon.bottou.org/publications/psgz/slds-2009.ps.gz|slds-2009.ps.gz]]
@unpublished{bottou-slds-open-problem-2009,
author = {Bottou, L\'{e}on},
title = {Curiously fast convergence of some stochastic gradient descent algorithms},
note = {Unpublished open problem offered to the attendance of the SLDS 2009 conference},
year = {2009},
url = {http://leon.bottou.org/papers/bottou-slds-open-problem-2009},
}