===== Curiously fast convergence of some stochastic gradient descent algorithms =====

This document reports an empirical finding about the convergence speed of stochastic gradient algorithms applied to a finite training set, and challenges the attendees of [[https://www.ceremade.dauphine.fr/SLDS2009|SLDS 2009]] with its theoretical implications. Stochastic gradient optimization theory states that the residual error decreases no faster than //t//<sup>-1</sup> when the examples are picked randomly with replacement. The main observation is that picking the examples **without** replacement (i.e., shuffling the data before each epoch and then going through it sequentially) achieves a speed curiously close to //t//<sup>-2</sup>. To my knowledge (as of 2014), the convergence speed of SGD sampled without replacement had not been theoretically established.

Although the open problems were not published in the [[summa-2011|SLDS 2009 conference book]], this finding was mentioned and replicated in the [[http://dl.acm.org/citation.cfm?id=1966812|journal]] [[http://www.cs.huji.ac.il/~shais/papers/ShalevSiSrCo10.pdf|version of the Pegasos paper]] (2011, section 7.5). The open problem was also mentioned by [[http://www.eecs.berkeley.edu/~brecht|Ben Recht]] during the [[http://www.ipam.ucla.edu/programs/sgm2014|IPAM Workshop on Stochastic Gradient Methods]] (2014), in relation with the much more impressive speeds achieved by recent algorithms for the optimization of finite sums ([[http://arxiv.org/abs/1309.2388|SAG]], [[http://papers.nips.cc/paper/4937-accelerating-stochastic-gradient-descent-using-predictive-variance-reduction|SVRG]]).

New (Dec 2015): [[https://asu.mit.edu/|Asu Ozdaglar]] gave a proof during her NIPS 2015 invited talk.
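The observation is easy to reproduce. The following minimal sketch (my own toy setup, not the experiment from any of the papers above) runs SGD with step size 1/t on an average of one-dimensional quadratics, comparing sampling with replacement against reshuffling the data before each epoch; the reshuffled run typically ends orders of magnitude closer to the optimum.

```python
import numpy as np

# Toy experiment (illustrative only): minimize the average of n quadratics
# f_i(w) = a_i * (w - b_i)^2 / 2 by SGD with step size 1/t, comparing
# sampling with replacement against random reshuffling (shuffle the data
# once per epoch, then sweep it sequentially).

rng = np.random.default_rng(0)
n, epochs, reps = 100, 200, 20
a = rng.uniform(0.5, 1.5, n)          # curvatures
b = rng.normal(size=n)                # minimizers of the individual terms
w_star = np.sum(a * b) / np.sum(a)    # exact minimizer of the average loss

def run(shuffle):
    """Squared distance to the optimum after `epochs` passes over the data."""
    w, t = 0.0, 0
    for _ in range(epochs):
        idx = rng.permutation(n) if shuffle else rng.integers(0, n, n)
        for i in idx:
            t += 1
            w -= (1.0 / t) * a[i] * (w - b[i])   # stochastic gradient step
    return (w - w_star) ** 2

mse_replace = np.mean([run(shuffle=False) for _ in range(reps)])
mse_shuffle = np.mean([run(shuffle=True) for _ in range(reps)])
print(f"with replacement: {mse_replace:.3e}")
print(f"reshuffled:       {mse_shuffle:.3e}")
```

Averaged over a few repetitions, the with-replacement error is consistent with the //t//<sup>-1</sup> rate, while the reshuffled error shrinks much faster, in line with the //t//<sup>-2</sup> behavior reported above.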
([[http://research.microsoft.com/apps/video/?id=259592|video recording]]) ([[http://arxiv.org/abs/1510.08560|arXiv]])

\\ Léon Bottou: //**Curiously fast convergence of some stochastic gradient descent algorithms**//, unpublished open problem offered to the attendees of the SLDS 2009 conference, 2009. [[http://leon.bottou.org/publications/djvu/slds-2009.djvu|slds-2009.djvu]] [[http://leon.bottou.org/publications/pdf/slds-2009.pdf|slds-2009.pdf]] [[http://leon.bottou.org/publications/psgz/slds-2009.ps.gz|slds-2009.ps.gz]]

  @unpublished{bottou-slds-open-problem-2009,
    author = {Bottou, L\'{e}on},
    title = {Curiously fast convergence of some stochastic gradient descent algorithms},
    note = {Unpublished open problem offered to the attendance of the SLDS 2009 conference},
    year = {2009},
    url = {http://leon.bottou.org/papers/bottou-slds-open-problem-2009},
  }