This document reports an empirical finding about the convergence speed
of stochastic gradient algorithms applied to finite training set and
challenges the attendees of SLDS 2009
with their theoretical implications. The stochastic gradient optimization theory
states that the residual error decreases no faster than *t ^{-1}* when
the examples are picked randomly with replacement. The main observation is that
picking the examples with replacement (i.e. shuffling the data before each epoch and
going sequentially) achieves a speed curiously close to

New (Dec 2015): Asu Ozdaglar gave a proof during her NIPS 2015 invited talk. (video recording ) (ArXiV)

Léon Bottou: **Curiously fast convergence of some stochastic gradient descent algorithms**,
Unpublished open problem offered to the attendance of the SLDS 2009 conference, 2009.

@unpublished{bottou-slds-open-problem-2009, author = {Bottou, L\'{e}on}, title = {Curiously fast convergence of some stochastic gradient descent algorithms}, note = {Unpublished open problem offered to the attendance of the SLDS 2009 conference}, year = {2009}, url = {http://leon.bottou.org/papers/bottou-slds-open-problem-2009}, }

papers/bottou-slds-open-problem-2009.txt · Last modified: 2016/02/10 13:18 by leonb