## Statistical Learning and Data Science

Editors: Mireille Gettler Summa, Léon Bottou, Bernard Goldfarb, Fionn Murtagh, Catherine Pardoux, Myriam Touati.

Statistical Learning and Data Science is a work of reference in the rapidly evolving context of converging methodologies. It gathers contributions from some of the foundational thinkers in the different fields of data analysis and covers the major theoretical results in the domain. On the methodological front, the volume includes conformal prediction and frameworks for assessing confidence in outputs, together with attendant risk. It illustrates a wide range of applications, including semantics, credit risk, energy production, genomics, and ecology. The book also addresses the origins and evolution of unsupervised data analysis, and presents approaches for time series, symbolic data, and functional data.

Over the history of multidimensional data analysis, increasingly complex data have become available for processing. Supervised machine learning, semi-supervised analysis approaches, and unsupervised data analysis provide great capability for addressing the digital data deluge. Exploring the foundations and recent breakthroughs in the field, Statistical Learning and Data Science demonstrates how data analysis can improve personal and collective health and the well-being of our social, business, and physical environments.

Statistical Learning and Data Science, Edited by Mireille Gettler Summa, Léon Bottou, Bernard Goldfarb, Fionn Murtagh, Catherine Pardoux and Myriam Touati, CRC Computer Science & Data Analysis, Chapman & Hall, 2011.
@book{summa-2011,
editor = {Summa, Mireille Gettler and Bottou, L\'eon and Goldfarb, Bernard
and Murtagh, Fionn and Pardoux, Catherine and Touati, Myriam},
title = {Statistical Learning and Data Science},
publisher = {Chapman \& Hall},
year = {2011},
series = {CRC Computer Science \& Data Analysis},
url = {http://leon.bottou.org/papers/summa-2011},
}

### Contents

1. Mining on Social Networks.
Benjamin Chapus, Françoise Fogelman Soulié, Erik Marcadé, Julien Sauvage.
2. Large-Scale Machine Learning with Stochastic Gradient Descent.
Léon Bottou.
3. Fast Optimization Algorithms for Solving SVM+.
4. Conformal Predictors in Semi-Supervised Case.
Dmitry Adamskiy, Ilia Nouretdinov, Alexander Gammerman.
5. Some Properties of Infinite VC-Dimension Systems.
Alexey Chervonenkis.
6. Choriogenesis.
Jean-Paul Benzécri.
7. GDA in a Social Science Research Program: The Case of Bourdieu’s Sociology.
Frédéric Lebaron.
8. Semantics from Narrative: State of the Art and Future Prospects.
Fionn Murtagh, Adam Ganz, Joe Reddington.
9. Measuring Classifier Performance.
David J. Hand.
10. A Clustering Approach to Monitor System Working.
Alzennyr Da Silva, Yves Lechevallier, Redouane Seraoui.
11. Introduction to Molecular Phylogeny.
12. Bayesian Analysis of Structural Equation Models Using Parameter Expansion.
Séverine Demeyer, Jean-Louis Foulley, Nicolas Fischer, Gilbert Saporta.
13. Clustering Trajectories of a Three-Way Longitudinal Data Set.
Mireille Gettler Summa, Bernard Goldfarb, Maurizio Vichi.
14. Trees with Soft Nodes.
Antonio Ciampi.
15. Synthesis of Objects.
Myriam Touati, Mohamed Djedour, Edwin Diday.
16. Functional Data Analysis: An Interdisciplinary Statistical Topic.
Laurent Delsol, Frédéric Ferraty, Adela Martínez Calvo.
17. Methodological Richness of Functional Data Analysis.
Wenceslao González Manteiga, Philippe Vieu.

### Notes

Chapter 5 (Chervonenkis) contains a short proof that elucidates what happens when uniform convergence does not take place, that is, when the entropy per example converges to a number c > 0. It is then possible to identify an event with probability c for which the learning machine is non-falsifiable. This result has been mentioned in Statistical Learning Theory (Vapnik, 1998, Theorem 3.6). To my knowledge, this is the first publication of the proof in English.
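
For readers who want the condition spelled out, the following is a sketch of the standard definitions behind "entropy per example". The notation (the function class $\mathcal{F}$, the count $N$, the entropy $H$, the sample size $\ell$) is assumed here for illustration and may differ from the chapter's. The base-2 logarithm is also an assumption, chosen so that c lies in [0, 1] and can be read as a probability.

```latex
% Number of distinct labelings that the class F induces on a sample of size l
\[
N^{\mathcal{F}}(x_1,\dots,x_\ell)
  = \bigl|\{\, (f(x_1),\dots,f(x_\ell)) : f \in \mathcal{F} \,\}\bigr|
  \le 2^{\ell}
\]
% Entropy of the class on samples of size l (expectation over i.i.d. samples)
\[
H^{\mathcal{F}}(\ell) = \mathbb{E}\,\log_2 N^{\mathcal{F}}(x_1,\dots,x_\ell)
\]
% Uniform convergence holds if and only if the entropy per example tends to 0;
% the note above concerns the complementary case
\[
\lim_{\ell\to\infty} \frac{H^{\mathcal{F}}(\ell)}{\ell} = c,
\qquad 0 < c \le 1
\]
```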