<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="FeedCreator 1.8" -->
<?xml-stylesheet href="https://leon.bottou.org/lib/exe/css.php?s=feed" type="text/css"?>
<rdf:RDF
    xmlns="http://purl.org/rss/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel rdf:about="https://leon.bottou.org/feed.php">
        <title>leon.bottou.org</title>
        <description></description>
        <link>https://leon.bottou.org/</link>
        <image rdf:resource="https://leon.bottou.org/_media/logo.png" />
        <dc:date>2026-04-07T10:02:38+00:00</dc:date>
        <items>
            <rdf:Seq>
                <rdf:li rdf:resource="https://leon.bottou.org/papers/siamnews-2025?rev=1773776704"/>
                <rdf:li rdf:resource="https://leon.bottou.org/papers/bottou-schoelkopf-2023?rev=1773776539"/>
                <rdf:li rdf:resource="https://leon.bottou.org/papers?rev=1773776183"/>
                <rdf:li rdf:resource="https://leon.bottou.org/papers/zhang-bottou-2025?rev=1773775533"/>
                <rdf:li rdf:resource="https://leon.bottou.org/papers/chen-2025?rev=1773775419"/>
                <rdf:li rdf:resource="https://leon.bottou.org/papers/zhang-2025?rev=1773774997"/>
                <rdf:li rdf:resource="https://leon.bottou.org/papers/cabannes-2023?rev=1773774619"/>
                <rdf:li rdf:resource="https://leon.bottou.org/papers/bietti-2023?rev=1773774560"/>
            </rdf:Seq>
        </items>
    </channel>
    <image rdf:about="https://leon.bottou.org/_media/logo.png">
        <title>leon.bottou.org</title>
        <link>https://leon.bottou.org/</link>
        <url>https://leon.bottou.org/_media/logo.png</url>
    </image>
    <item rdf:about="https://leon.bottou.org/papers/siamnews-2025?rev=1773776704">
        <dc:format>text/html</dc:format>
        <dc:date>2026-03-17T19:45:04+00:00</dc:date>
        <dc:creator>leonb (leonb@undisclosed.example.com)</dc:creator>
        <title>siamnews-2025</title>
        <link>https://leon.bottou.org/papers/siamnews-2025?rev=1773776704</link>
        <description>This short position paper argues that LLMs are best viewed as Fiction Machines, that is, machines able to write stories that may bear no relation to fact but are internally coherent. Considerable effort is spent aligning these machines with our expectations and chasing hallucinations, while ignoring the fact that the ability to make up stories is key to intelligence.</description>
    </item>
    <item rdf:about="https://leon.bottou.org/papers/bottou-schoelkopf-2023?rev=1773776539">
        <dc:format>text/html</dc:format>
        <dc:date>2026-03-17T19:42:19+00:00</dc:date>
        <dc:creator>leonb (leonb@undisclosed.example.com)</dc:creator>
        <title>bottou-schoelkopf-2023 - created</title>
        <link>https://leon.bottou.org/papers/bottou-schoelkopf-2023?rev=1773776539</link>
        <description>Abstract: “Many believe that Large Language Models (LLMs) open the era of Artificial Intelligence (AI). Some see opportunities while others see dangers. Yet both proponents and opponents grasp AI through the imagery popularised by science fiction. Will the machine become sentient and rebel against its creators? Will we experience a paperclip apocalypse? Before answering such questions, we should first ask whether this mental imagery provides a good description of the phenomenon at han…</description>
    </item>
    <item rdf:about="https://leon.bottou.org/papers?rev=1773776183">
        <dc:format>text/html</dc:format>
        <dc:date>2026-03-17T19:36:23+00:00</dc:date>
        <dc:creator>leonb (leonb@undisclosed.example.com)</dc:creator>
        <title>papers</title>
        <link>https://leon.bottou.org/papers?rev=1773776183</link>
        <description>Publications

Follow each publication link to access papers and supplemental data.

Most papers are available in DjVu, PDF, and PS.GZ.

	*  Download a DjVu viewer.
	*  Link to my page on Google Scholar

2026

... in progress ...

2025
Jingtong Su, Jianyu Zhang, Karen Ullrich, Léon Bottou and Mark Ibrahim:</description>
    </item>
    <item rdf:about="https://leon.bottou.org/papers/zhang-bottou-2025?rev=1773775533">
        <dc:format>text/html</dc:format>
        <dc:date>2026-03-17T19:25:33+00:00</dc:date>
        <dc:creator>leonb (leonb@undisclosed.example.com)</dc:creator>
        <title>zhang-bottou-2025</title>
        <link>https://leon.bottou.org/papers/zhang-bottou-2025?rev=1773775533</link>
        <description>Memory Mosaics at Scale

Abstract:
Memory Mosaics, networks of associative memories, have
demonstrated appealing compositional and in-context learning capabilities on
medium-scale networks (GPT-2 scale) and small synthetic datasets. This work
shows that these favorable properties remain when we scale memory mosaics to
large language model sizes (llama-8B scale) and real-world datasets.
To this end, we scale memory mosaics to 10B size, we train them on one trillion
tokens, we introduce a couple a…</description>
    </item>
    <item rdf:about="https://leon.bottou.org/papers/chen-2025?rev=1773775419">
        <dc:format>text/html</dc:format>
        <dc:date>2026-03-17T19:23:39+00:00</dc:date>
        <dc:creator>leonb (leonb@undisclosed.example.com)</dc:creator>
        <title>chen-2025 - created</title>
        <link>https://leon.bottou.org/papers/chen-2025?rev=1773775419</link>
        <description>MagicPIG: LSH Sampling for Efficient LLM Generation

Abstract:
Large language models (LLMs) with long context windows have gained significant
attention. However, the KV cache, stored to avoid re-computation, becomes
a bottleneck. Various dynamic sparse or TopK-based attention approximation
methods have been proposed to leverage the common insight that attention is
sparse. In this paper, we first show that TopK attention itself suffers from quality
degradation in certain downstream tasks because …</description>
    </item>
    <item rdf:about="https://leon.bottou.org/papers/zhang-2025?rev=1773774997">
        <dc:format>text/html</dc:format>
        <dc:date>2026-03-17T19:16:37+00:00</dc:date>
        <dc:creator>leonb (leonb@undisclosed.example.com)</dc:creator>
        <title>zhang-2025</title>
        <link>https://leon.bottou.org/papers/zhang-2025?rev=1773774997</link>
        <description>Memory Mosaics

Abstract: Memory Mosaics are networks of associative memories working in concert to
achieve a prediction task of interest. Like transformers, memory mosaics possess
compositional and in-context learning capabilities. Unlike transformers, memory mosaics achieve these capabilities in a comparatively transparent way
(“predictive disentanglement”). We illustrate these capabilities on a toy example
and also show that memory mosaics perform as well or better than transformer…</description>
    </item>
    <item rdf:about="https://leon.bottou.org/papers/cabannes-2023?rev=1773774619">
        <dc:format>text/html</dc:format>
        <dc:date>2026-03-17T19:10:19+00:00</dc:date>
        <dc:creator>leonb (leonb@undisclosed.example.com)</dc:creator>
        <title>cabannes-2023</title>
        <link>https://leon.bottou.org/papers/cabannes-2023?rev=1773774619</link>
        <description>Active Self-Supervised Learning: A Few Low-Cost Relationships Are All You Need

Abstract:
Self-Supervised Learning (SSL) has emerged as the solution of 
choice to learn transferable representations from
unlabeled data. However, SSL requires building samples that are
known to be semantically akin, i.e., positive views. Requiring
such knowledge is the main limitation of SSL and is often tackled by ad-hoc strategies,
e.g., applying known data augmentations to the same input.
In this work, we formal…</description>
    </item>
    <item rdf:about="https://leon.bottou.org/papers/bietti-2023?rev=1773774560">
        <dc:format>text/html</dc:format>
        <dc:date>2026-03-17T19:09:20+00:00</dc:date>
        <dc:creator>leonb (leonb@undisclosed.example.com)</dc:creator>
        <title>bietti-2023</title>
        <link>https://leon.bottou.org/papers/bietti-2023?rev=1773774560</link>
        <description>Birth of a Transformer: A Memory Viewpoint

Abstract:
Large language models based on transformers have achieved great empirical successes. 
However, as they are deployed more widely, there is a growing need to
better understand their internal mechanisms in order to make them more reliable.
These models appear to store vast amounts of knowledge from their training data,
and to adapt quickly to new information provided in their context or prompt. We
study how transformers balance these two types o…</description>
    </item>
</rdf:RDF>
