leon.bottou.org
https://leon.bottou.org/
2024-03-18T22:02:31+00:00leon.bottou.org
https://leon.bottou.org/
https://leon.bottou.org/_media/favicon.icotext/html2023-12-19T11:44:34+00:00leonb (leonb@undisclosed.example.com)news:borges_and_ai
https://leon.bottou.org/news/borges_and_ai?rev=1703004274
Borges and AI
Léon Bottou and Bernhard Schölkopf
<https://arxiv.org/abs/2310.01425>
We started this work mid-2022. AI was already turning into a mainstream topic. Both as a scientist and a member of the society, I was troubled by the ambient confusion between the actual AI technology and the AI of our dreams or nightmares. We seem unable to grasp this technology and its impact without referring to an AI mythology that maybe starts with Homer'stext/html2023-09-14T20:23:35+00:00leonb (leonb@undisclosed.example.com)projects:neuristique
https://leon.bottou.org/projects/neuristique?rev=1694737415
Neuristique s.a.
Neuristique was founded in 1988 by a dozen friends with big dreams.
The mission statement was very long sentence that mentions
the application of artificial neural networks, the development of artificial brains,
and the exploration of space. We were very young and inexperienced.text/html2023-08-29T06:19:29+00:00leonb (leonb@undisclosed.example.com)papers - [2023]
https://leon.bottou.org/papers?rev=1693304369
Publications
Follow each publication link to access papers and supplemental data.
Most papers are available in DjVu, PDF, and PS.GZ.
Download a DjVu viewer.
2023
Alberto Bietti, Vivien Cabannes, Diane Bouchacourt, Herve Jegou and Leon Bottou: Birth of a Transformer: A Memory Viewpointtext/html2023-08-29T06:15:04+00:00leonb (leonb@undisclosed.example.com)papers:zhang-2023 - created
https://leon.bottou.org/papers/zhang-2023?rev=1693304104
Learning useful representations for shifting tasks and distributions
Abstract: Does the dominant approach to learn representations (as a side effect of optimizing an expected
cost for a single training distribution) remain a
good approach when we are dealing with multiple distributions? Our thesis is that such scenarios are better served by representations that are
richer than those obtained with a single optimization episode. We support this thesis with simple
theoretical arguments and with ex…text/html2023-08-29T06:12:20+00:00leonb (leonb@undisclosed.example.com)papers:rame-2023 - created
https://leon.bottou.org/papers/rame-2023?rev=1693303940
Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization
Abstract: Foundation models are redefining how AI systems are built. Practitioners now follow a standard procedure to build their machine learning
solutions: from a pre-trained foundation model,
they fine-tune the weights on the target task of
interest. So, the Internet is swarmed by a handful
of foundation models fine-tuned on many diverse
tasks: these individual fine-tunings exist in isolation without benefitin…text/html2023-08-29T06:08:41+00:00leonb (leonb@undisclosed.example.com)papers:balestriero-2022 - created
https://leon.bottou.org/papers/balestriero-2022?rev=1693303721
The Effects of Regularization and Data Augmentation are Class Dependent
Abstract: Regularization is a fundamental technique to prevent over-fitting and to improve generalization
performances by constraining a model’s complexity. Current Deep Networks heavily rely on regularizers such as Data-Augmentation (DA) or weight-decay, and employ structural risk minimization,
i.e. cross-validation, to select the optimal regularization hyper-parameters. In this study, we demonstrate that techniques such…text/html2023-08-29T06:05:37+00:00leonb (leonb@undisclosed.example.com)papers:zhang-2022 - created
https://leon.bottou.org/papers/zhang-2022?rev=1693303537
Rich feature construction for the optimization-generalization dilemma
Abstract: There often is a dilemma between ease of optimization and robust out-of-distribution (OoD)
generalization. For instance, many OoD methods
rely on penalty terms whose optimization is challenging. They are either too strong to optimize
reliably or too weak to achieve their goals.
In order to escape this dilemma, we propose to
first construct a rich representation (text/html2023-08-29T05:59:33+00:00leonb (leonb@undisclosed.example.com)papers:peysakhovich-2022 - created
https://leon.bottou.org/papers/peysakhovich-2022?rev=1693303173
Pseudo-Euclidean Attract-Repel Embeddings for Undirected Graphs
Abstract: Dot product embeddings take a graph and construct vectors for nodes such that dot products
between two vectors give the strength of the edge.
Dot products make a strong transitivity assumption, however, many important forces generating
graphs in the real world lead to non-transitive
relationships. We remove the transitivity assumption by embedding nodes into a pseudo-Euclidean
space - giving each node an attract and a r…text/html2023-08-29T05:54:47+00:00leonb (leonb@undisclosed.example.com)papers:defossez-2022
https://leon.bottou.org/papers/defossez-2022?rev=1693302887
A simple convergence proof of Adam and Adagrad
Abstract:
We provide a simple proof of convergence
covering both the Adam and Adagrad adaptive optimization algorithms when applied
to smooth (possibly non-convex) objective
functions with bounded gradients. We show
that in expectation, the squared norm of the
objective gradient averaged over the trajectory has an upper-bound which is explicit
in the constants of the problem, parameters
of the optimizer and the total number of iterations N. This bo…text/html2022-04-20T15:56:41+00:00leonb (leonb@undisclosed.example.com)talks:perceptrons
https://leon.bottou.org/talks/perceptrons?rev=1650484601
Perceptrons Revisited
This talk was given during the AAAI 2015 Spring Symposium on Knowledge Representation and Reasoning, whose theme was Integrating Symbolic and Neural Approaches. The idea of the talk is to re-read the classic 1968 book “Perceptrons