leon.bottou.org

news:borges_and_ai

leonb (leonb@undisclosed.example.com) — 2023-12-19T11:44:34+00:00

Borges and AI Léon Bottou and Bernhard Schölkopf We started this work mid-2022. AI was already turning into a mainstream topic. As both a scientist and a member of society, I was troubled by the ambient confusion between the actual AI technology and the AI of our dreams or nightmares. We seem unable to grasp this technology and its impact without referring to an AI mythology that maybe starts with Homer's

projects:neuristique

leonb (leonb@undisclosed.example.com) — 2023-09-14T20:23:35+00:00

Neuristique s.a. Neuristique was founded in 1988 by a dozen friends with big dreams. The mission statement was a very long sentence that mentioned the application of artificial neural networks, the development of artificial brains, and the exploration of space. We were very young and inexperienced.

papers - [2023]

leonb (leonb@undisclosed.example.com) — 2023-08-29T06:19:29+00:00

Publications Follow each publication link to access papers and supplemental data. Most papers are available in DjVu, PDF, and PS.GZ. Download a DjVu viewer. 2023 Alberto Bietti, Vivien Cabannes, Diane Bouchacourt, Herve Jegou and Leon Bottou: Birth of a Transformer: A Memory Viewpoint

papers:zhang-2023 - created

leonb (leonb@undisclosed.example.com) — 2023-08-29T06:15:04+00:00

Learning useful representations for shifting tasks and distributions Abstract: Does the dominant approach to learn representations (as a side effect of optimizing an expected cost for a single training distribution) remain a good approach when we are dealing with multiple distributions? Our thesis is that such scenarios are better served by representations that are richer than those obtained with a single optimization episode. We support this thesis with simple theoretical arguments and with ex…

papers:rame-2023 - created

leonb (leonb@undisclosed.example.com) — 2023-08-29T06:12:20+00:00

Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization Abstract: Foundation models are redefining how AI systems are built. Practitioners now follow a standard procedure to build their machine learning solutions: from a pre-trained foundation model, they fine-tune the weights on the target task of interest. So, the Internet is swarmed by a handful of foundation models fine-tuned on many diverse tasks: these individual fine-tunings exist in isolation without benefitin…

papers:balestriero-2022 - created

leonb (leonb@undisclosed.example.com) — 2023-08-29T06:08:41+00:00

The Effects of Regularization and Data Augmentation are Class Dependent Abstract: Regularization is a fundamental technique to prevent over-fitting and to improve generalization performances by constraining a model’s complexity. Current Deep Networks heavily rely on regularizers such as Data-Augmentation (DA) or weight-decay, and employ structural risk minimization, i.e. cross-validation, to select the optimal regularization hyper-parameters. In this study, we demonstrate that techniques such…

papers:zhang-2022 - created

leonb (leonb@undisclosed.example.com) — 2023-08-29T06:05:37+00:00

Rich feature construction for the optimization-generalization dilemma Abstract: There often is a dilemma between ease of optimization and robust out-of-distribution (OoD) generalization. For instance, many OoD methods rely on penalty terms whose optimization is challenging. They are either too strong to optimize reliably or too weak to achieve their goals. In order to escape this dilemma, we propose to first construct a rich representation (

papers:peysakhovich-2022 - created

leonb (leonb@undisclosed.example.com) — 2023-08-29T05:59:33+00:00

Pseudo-Euclidean Attract-Repel Embeddings for Undirected Graphs Abstract: Dot product embeddings take a graph and construct vectors for nodes such that dot products between two vectors give the strength of the edge. Dot products make a strong transitivity assumption; however, many important forces generating graphs in the real world lead to non-transitive relationships. We remove the transitivity assumption by embedding nodes into a pseudo-Euclidean space - giving each node an attract and a r…

papers:defossez-2022

leonb (leonb@undisclosed.example.com) — 2023-08-29T05:54:47+00:00

A simple convergence proof of Adam and Adagrad Abstract: We provide a simple proof of convergence covering both the Adam and Adagrad adaptive optimization algorithms when applied to smooth (possibly non-convex) objective functions with bounded gradients. We show that in expectation, the squared norm of the objective gradient averaged over the trajectory has an upper-bound which is explicit in the constants of the problem, parameters of the optimizer and the total number of iterations N. This bo…
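As a rough illustration of the quantity this abstract refers to, the sketch below runs a textbook Adagrad update on a smooth, non-convex objective with bounded gradients and reports the squared gradient norm averaged over the trajectory. It is not taken from the paper: the objective, the step size, the `eps` placement, and the names `adagrad` and `grad_f` are all assumptions chosen for illustration, and the sketch uses exact rather than stochastic gradients.

```python
import math


def grad_f(x):
    """Gradient of f(x) = sum_i x_i^2 / (1 + x_i^2).

    Hypothetical test objective: smooth, non-convex, bounded gradients.
    """
    return [2.0 * xi / (1.0 + xi * xi) ** 2 for xi in x]


def adagrad(grad, x0, lr=0.5, eps=1e-8, steps=200):
    """Textbook Adagrad (assumed form, not necessarily the paper's variant).

    Returns the final iterate and the squared gradient norm averaged
    over all visited iterates.
    """
    x = list(x0)
    accum = [0.0] * len(x)  # per-coordinate running sum of squared gradients
    sq_norms = []
    for _ in range(steps):
        g = grad(x)
        sq_norms.append(sum(gi * gi for gi in g))
        for i, gi in enumerate(g):
            accum[i] += gi * gi
            x[i] -= lr * gi / (math.sqrt(accum[i]) + eps)
    return x, sum(sq_norms) / len(sq_norms)


if __name__ == "__main__":
    for n in (20, 200):
        x_final, avg_sq_norm = adagrad(grad_f, [2.0, -1.5], steps=n)
        print(n, x_final, avg_sq_norm)
```

On this toy problem the averaged squared gradient norm shrinks as the number of iterations grows, which is the kind of behavior the stated bound describes. The actual constants and rates are given in the paper, not by this sketch.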

talks:perceptrons

leonb (leonb@undisclosed.example.com) — 2022-04-20T15:56:41+00:00

Perceptrons Revisited This talk was given during the AAAI 2015 Spring Symposium on Knowledge Representation and Reasoning, whose theme was Integrating Symbolic and Neural Approaches. The idea of the talk is to re-read the classic 1968 book “Perceptrons