# Structured Learning Systems

## Speech Recognition

I started working on structured learning systems in the
context of my Ph.D. thesis work on speech recognition.
I first focused on combining time-delay neural networks
and dynamic programming techniques. When Bourlard and Wellekens
published their first paper combining HMMs and neural networks,
I realized that the HMM framework provided much greater opportunities
to approach the problem. It also led to the concept of *“global training”*.
This was developed in my 1991 Ph.D. thesis,
where I also identified a curious modeling problem that was
later termed *“the label bias problem”*.

## Graph Transformer Networks

Attempts to solve the label bias problem led to the non-probabilistic (LVQ-based) approach described in the very last paragraph of the IJCNN 1991 paper. Meanwhile, Burges and Denker solved the probabilistic puzzle in 1994, which led to the document analysis systems and the graph transformer network work. To approach this work, I would first recommend reading the 1996 draft, which I find much clearer than the published papers (see "Graph Transducer Networks explained").

## A Broader Perspective

My point of view evolved dramatically around 2010 when I started rethinking the connections between structured learning and the emerging deep learning methods. The continuation of this research work can be found in the section on Machine Reasoning and Machine Learning.

## Tutorials

- The slides about Graph Transformer Networks.
- The tutorial Energy Based Learning by Yann LeCun and his NYU collaborators.

## Publications

**A Framework for the Cooperation of Learning Algorithms**,

*Advances in Neural Information Processing Systems*, 3, Edited by D. Touretzky and R. Lippmann, Morgan Kaufmann, Denver, 1991.

**Une Approche théorique de l'Apprentissage Connexionniste: Applications à la Reconnaissance de la Parole**,

*Orsay, France, 1991.*

**Learning Vector Quantization, Multi Layer Perceptron and Dynamic Time Warping: Comparison and Cooperation**,

*Proceedings of the International Joint Conference on Neural Networks*, Seattle, 1991.

**Draft report: Document Analysis with Transducers**,

*July 1996.*

**Global Training of Document Processing Systems using Graph Transformer Networks**,

*Proc. of Computer Vision and Pattern Recognition*, 489-493, IEEE, Puerto Rico, 1997.

**Reading Checks with Graph Transformer Networks**,

*International Conference on Acoustics, Speech, and Signal Processing*, 1:151-154, IEEE, Munich, 1997.

**Gradient Based Learning Applied to Document Recognition**,

*Proceedings of the IEEE*, 86(11):2278-2324, 1998.

**Object Recognition with Gradient-Based Learning**,

*Feature Grouping*, Edited by David Forsyth, Springer Verlag, 1999.

**Graph Transformer Networks for Image Recognition**,

*Bulletin of the International Statistical Institute (ISI)*, 2005.