Abstract: Former experiments have shown the benefit of using specific multi-layer architectures, the so-called time delay neural networks, for phoneme recognition (Waibel, Hanazawa, Hinton, Shikano, & Lang, 1988). Similar experiments on a speaker-independent task were also performed on a small set of minimal pairs (Bottou, 1988). In this paper we focus on a speaker-independent, global word recognition task with time delay networks. We first describe these networks as a way for learning feature extractors by constrained back-propagation. Such a time-delay network is shown to be capable of dealing with a near real-sized problem: French digit recognition. The results are discussed and compared, on the same data sets, with those obtained with a classical time warping system.
@article{bottou-90,
  author = {Bottou, {L\'eon} and Fogelman Souli\'e, Fran\c{c}oise and Blanchet, Pascal and Lienard, {Jean Sylvain}},
  title = {Speaker independent isolated digit recognition: Multilayer perceptron vs Dynamic Time Warping},
  journal = {Neural Networks},
  year = {1990},
  volume = {3},
  pages = {453--465},
  url = {http://leon.bottou.org/papers/bottou-90},
}