leon.bottou.org

Ph. D. Dissertation

The long title translates as "A theoretical approach to connectionist learning, with applications to speech recognition". This is a fairly large document containing a number of ideas that were relatively new at the time.

Connectionist learning algorithms can be studied as stochastic approximations and more specifically stochastic gradient descent algorithms, possibly involving surrogate loss functions (chapters 2 and 3).
The performance of these algorithms can be studied using advanced statistical theories such as Vladimir Vapnik's Structural Risk Minimization (chapter 4).
The speed of these algorithms can be approached using the methods of numerical optimization (chapter 5).
Complex applications, such as speech recognition, can be addressed using modular learning systems such as Discriminant Hidden Markov Models (chapters 8 and 9).
Probabilistic discriminant modular learning systems are often limited by the so-called label-bias problem (chapter 10). Finding a solution took three more years.
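
To make the first point above concrete, here is a minimal sketch of stochastic gradient descent on a surrogate loss. It is an illustration only, not code from the thesis: the choice of the logistic loss as the surrogate for the 0-1 classification error, the function names `sgd_logistic` and `zero_one_error`, the learning rate, and the toy data are all assumptions made for this example.

```python
import math
import random


def sgd_logistic(samples, epochs=50, lr=0.1, seed=0):
    """Stochastic gradient descent on the logistic loss log(1 + exp(-y * w.x)).

    The logistic loss is used here as a smooth surrogate for the 0-1 loss
    (an illustrative choice). `samples` is a list of (x, y) pairs, with x a
    list of floats and y in {-1, +1}.
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0][0])
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in random order
        for i in order:
            x, y = samples[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # The gradient of log(1 + exp(-margin)) with respect to w is
            # -y * x / (1 + exp(margin)); step in the opposite direction.
            coeff = y / (1.0 + math.exp(margin))
            w = [wj + lr * coeff * xj for wj, xj in zip(w, x)]
    return w


def zero_one_error(w, samples):
    """Fraction of samples misclassified by the sign of w.x."""
    errors = sum(
        1 for x, y in samples
        if y * sum(wj * xj for wj, xj in zip(w, x)) <= 0
    )
    return errors / len(samples)


# Hypothetical toy data; the last coordinate is a constant 1 acting as a bias.
data = [
    ([2.0, 1.0, 1.0], 1),
    ([1.5, 2.0, 1.0], 1),
    ([-1.0, -1.5, 1.0], -1),
    ([-2.0, -0.5, 1.0], -1),
]
w = sgd_logistic(data)
print(zero_one_error(w, data))
```

Each update uses a single example, so the algorithm follows a noisy estimate of the gradient of the average surrogate loss. The 0-1 error is never differentiated; it is only measured afterwards.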

Léon Bottou: Une Approche théorique de l'Apprentissage Connexionniste: Applications à la Reconnaissance de la Parole, Ph.D. thesis, Université de Paris XI, Orsay, France, 1991.

bottou-1991.djvu (one big file) bottou-1991.pdf bottou-1991.ps.gz
bottou-1991.djvu (served page per page)

@phdthesis{bottou-91a,
  title = {Une Approche th\'eorique de l'Apprentissage Connexionniste: Applications \`a la Reconnaissance de la Parole},
  author = {Bottou, {L\'eon}},
  year = {1991},
  school = {Universit\'{e} de Paris XI},
  address = {Orsay, France},
  url = {http://leon.bottou.org/papers/bottou-91a},
}