leon.bottou.org
http://leon.bottou.org/
2017-08-18T12:05:58-04:00
2017-08-16T17:18:54-04:00 leonb start
http://leon.bottou.org/start?rev=1502918334
Welcome
I am a research scientist with broad interests in practical and theoretical machine learning. My work on large-scale learning and stochastic gradient algorithms has received attention in recent years. I am also known for the DjVu document compression system.
I joined Facebook AI Research in March 2015.
Use the sidebar to navigate this site.
2017-08-10T15:18:31-04:00 leonb news:slds_2018 - created
http://leon.bottou.org/news/slds_2018?rev=1502392711
SLDS 2018
Announcing the Fourth Symposium on Statistical Learning and Data Science (SLDS), to be held at Science Po Paris, July 11-13, 2018. The SLDS symposia are held once every three years and historically have had strong themes. Given the involvement of Science Po and CEVIPOF, the theme of the fourth edition is shaping up to be “Data Science and Democracy
2017-06-22T17:04:22-04:00 leonb papers:sagun-2017 - created
http://leon.bottou.org/papers/sagun-2017?rev=1498165462
Empirical Analysis of the Hessian of Over-Parametrized Neural Networks
Abstract: We study the properties of common loss surfaces through their Hessian matrix. In particular, in the context of deep learning, we empirically show that the spectrum of the Hessian is composed of two parts: (1) the bulk centered near zero, (2) and outliers away from the bulk. We present numerical evidence and mathematical justifications to the following conjectures laid out by Sagun et al. [2016]: Fixing data, incre…
2017-06-22T16:43:52-04:00 leonb papers:lafond-vasilache-bottou-2017
http://leon.bottou.org/papers/lafond-vasilache-bottou-2017?rev=1498164232
Diagonal Rescaling For Neural Networks
Abstract: We define a second-order neural network stochastic gradient training algorithm whose block-diagonal structure effectively amounts to normalizing the unit activations. Investigating why this algorithm lacks in robustness then reveals two interesting insights. The first insight suggests a new way to scale the stepsizes, clarifying popular algorithms such as RMSProp as well as old neural network tricks such as fanin stepsize scaling. The second insi…
2017-06-22T16:42:33-04:00 leonb papers:lopezpaz-2017
http://leon.bottou.org/papers/lopezpaz-2017?rev=1498164153
Discovering Causal Signals in Images
Abstract: This paper establishes the existence of observable footprints that reveal the “causal dispositions” of the object categories appearing in collections of images. We achieve this goal in two steps. First, we take a learning approach to observational causal discovery, and build a classifier that achieves state-of-the-art performance on finding the causal direction between pairs of random variables, given samples from their joint distribution. Secon…
2017-06-22T16:40:54-04:00 leonb papers:arjovsky-chintalah-bottou-2017 - created
http://leon.bottou.org/papers/arjovsky-chintalah-bottou-2017?rev=1498164054
Wasserstein Generative Adversarial Networks
Abstract: We introduce a new algorithm named WGAN, an alternative to traditional GAN training. In this new model, we show that we can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches. Furthermore, we show that the corresponding optimization problem is sound, and provide extensive theoretical work highlighting the deep connections to di…
2017-06-22T16:25:15-04:00 leonb papers:arjovsky-bottou-2017 - created
http://leon.bottou.org/papers/arjovsky-bottou-2017?rev=1498163115
Towards principled methods for training generative adversarial networks
Abstract: The goal of this paper is not to introduce a single algorithm or method, but to make theoretical steps towards fully understanding the training dynamics of generative adversarial networks. In order to substantiate our theoretical analysis, we perform targeted experiments to verify our assumptions, illustrate our claims, and quantify the phenomena. This paper is divided into three sections. The first section introd…
2017-06-22T16:22:28-04:00 leonb papers - [2016]
http://leon.bottou.org/papers?rev=1498162948
Publications
Follow each publication link to access papers and supplemental data.
Most papers are available in DjVu, PDF, and PS.GZ.
Download a DjVu viewer.
2017
Levent Sagun, Utku Evci, V. Uğur Güney, Yann Dauphin and Léon Bottou: Empirical Analysis of the Hessian of Over-Parametrized Neural Networks
2017-02-10T10:15:06-04:00 leonb biography
http://leon.bottou.org/biography?rev=1486739706
Biography
Léon was born in 1965 in Saint Germain du Teil, Lozère
near the Aubrac plateau.
He spent his childhood in La Canourgue
and attended school in Rodez and Clermont-Ferrand,
then in École Sainte Geneviève, Versailles.
He received the Diplôme d'Ingénieur de l'École Polytechnique
(X84) in 1987,
the Magistère de Mathématiques Fondamentales et Appliquées et d'Informatique from
2017-01-18T09:44:15-04:00 leonb projects:lush - [Lush]
http://leon.bottou.org/projects/lush?rev=1484750655
Lush
Lush (<http://lush.sourceforge.net>) is an object-oriented programming language designed for researchers, experimenters, and engineers interested in large-scale numerical and graphic applications. It was designed to facilitate experimentation with multilayer neural networks and more generally to research what is now named deep learning. In general, Lush is useful in situations where one would want to combine the flexibility of a high-level, weakly-typed interpreted language, with the …
2016-10-31T11:56:08-04:00 leonb papers:bromley-bentz-93
http://leon.bottou.org/papers/bromley-bentz-93?rev=1477929368
Signature Verification using a Siamese Time Delay Neural Network
Abstract: This paper describes the development of an algorithm for
verification of signatures written on a touch-sensitive pad.
The signature verification algorithm is based on an artificial neural network. The novel network
presented here, called a
2016-10-31T11:54:29-04:00 leonb papers:bottou-90
http://leon.bottou.org/papers/bottou-90?rev=1477929269
Speaker independent isolated digit recognition: Multilayer perceptron vs Dynamic Time Warping
Abstract: Former experiments have shown the benefit of using specific multi-layer architectures, the so-called
time delay neural networks, for phoneme recognition (Waibel, Hanazawa, Hinton, Shikano, & Lang, 1988).
Similar experiments on a speaker-independent task were also performed on a small set of minimal pairs (Bottou,
1988). In this paper we focus on a speaker-independent, global word recognition …
2016-10-31T11:52:56-04:00 leonb papers:driancourt-bottou-90
http://leon.bottou.org/papers/driancourt-bottou-90?rev=1477929176
TDNN-Extracted features
Abstract: Time Delay Neural Network (TDNN) is a technique, derived from MLP, which performs a time invariant
processing in its lowest layers. This time invariant processing may be extracted from the network, in order
to code the speech for another classifier such as Dynamic Time Warping (DTW). The resulting hybrid
system shows improved performance with respect to both techniques used in isolation.
This paper describes this technique, gives results on a multi-speaker, …
2016-10-31T11:51:07-04:00 leonb papers:bottou-89
http://leon.bottou.org/papers/bottou-89?rev=1477929067
Experiments with Time Delay Networks and Dynamic Time Warping for Speaker Independent Isolated Digit Recognition
Léon Bottou, Françoise Fogelman Soulié, Pascal Blanchet and Jean Sylvain Lienard: Experiments with Time Delay Networks and Dynamic Time Warping for Speaker Independent Isolated Digit Recognition
2016-10-31T11:50:49-04:00 leonb papers:mejia-90
http://leon.bottou.org/papers/mejia-90?rev=1477929049
Galatea: A library for connectionist applications
Carlos Mejia, Léon Bottou and Françoise Fogelman Soulié: Galatea: A library for connectionist applications, Proceedings of the International Neural Networks Conference, INNC'90, 1:9-13, Paris, 1990.
2016-10-31T11:49:16-04:00 leonb papers:bottou-88b
http://leon.bottou.org/papers/bottou-88b?rev=1477928956
Reconnaissance de la parole par reseaux connexionnistes
This is my first neural network paper.
It describes the application of a time-delay neural network (TDNN) to the recognition of isolated words in speech signals, with performance comparable to LIMSI's state-of-the-art dynamic time warping (DTW) method. Besides describing one of the first subsampled convolutional neural networks, this paper describes how to correctly initialize the weights (page 9) and performs data augmentation with elastic t…