The long title translates as
*A theoretical approach to connectionist learning,
with applications for speech recognition*.
This is a fairly long document containing a number of
ideas that were relatively new at the time.

- Connectionist learning algorithms can be studied as stochastic approximations and more specifically stochastic gradient descent algorithms, possibly involving surrogate loss functions (chapters 2 and 3).
- The performance of these algorithms can be studied using advanced statistical theories such as Vladimir Vapnik's Structural Risk Minimization (chapter 4).
- The speed of these algorithms can be approached using the methods of numerical optimization (chapter 5).
- Complex applications, such as speech recognition, can be addressed using modular learning systems such as Discriminant Hidden Markov Models (chapters 8 and 9).
- Probabilistic discriminant modular learning systems are often limited by the so-called
*label-bias problem* (chapter 10). Finding a solution took a few more years.
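The first point above, viewing learning as stochastic gradient descent on a surrogate loss, can be sketched in a few lines. This is a minimal illustration, not code from the thesis: the hinge surrogate, toy data, and learning rate are all assumptions chosen for brevity.

```python
import random

def sgd_hinge(data, lr=0.1, epochs=20):
    """Stochastic gradient descent for a linear classifier on (x, y)
    pairs with y in {-1, +1}, minimizing the hinge surrogate loss
    max(0, 1 - y * w.x) one example at a time."""
    dim = len(data[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        random.shuffle(data)          # stochastic: visit examples in random order
        for x, y in data:
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            if margin < 1:            # nonzero subgradient of the hinge loss
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
    return w

# Toy linearly separable data (hypothetical): label follows the first coordinate.
data = [([1.0, 0.5], 1), ([2.0, -0.3], 1), ([-1.5, 0.2], -1), ([-0.8, -1.0], -1)]
w = sgd_hinge(data)
```

Each update touches a single example, which is the stochastic-approximation view of connectionist learning the thesis develops.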

Léon Bottou: **Une Approche théorique de l'Apprentissage Connexionniste: Applications à la Reconnaissance de la Parole**, Ph.D. thesis, Université de Paris XI, Orsay, France, 1991.

- bottou-1991.djvu (one big file)
- bottou-1991.pdf
- bottou-1991.ps.gz
- bottou-1991.djvu (served page per page)

@phdthesis{bottou-91a,
  title = {Une Approche th\'eorique de l'Apprentissage Connexionniste: Applications \`a la Reconnaissance de la Parole},
  author = {Bottou, {L\'eon}},
  year = {1991},
  school = {Universit\'{e} de Paris XI},
  address = {Orsay, France},
  url = {http://leon.bottou.org/papers/bottou-91a},
}